Title of the Invention

METHOD FOR INTEGRATING GENES AT SPECIFIC SITES IN MAMMALIAN CELLS VIA
HOMOLOGOUS RECOMBINATION AND VECTORS FOR ACCOMPLISHING THE SAME

Field of the Invention

The present invention relates to a process of targeting
the integration of a desired exogenous DNA to a specific
location within the genome of a mammalian cell. More
specifically, the invention describes a novel method for
identifying a transcriptionally active target site ("hot
spot") in the mammalian genome, and inserting a desired DNA
at this site via homologous recombination. The invention
also optionally provides the ability for gene amplification
of the desired DNA at this location by co-integrating an
amplifiable selectable marker, e.g., DHFR, in combination
with the exogenous DNA. The invention additionally
describes the construction of novel vectors suitable for
accomplishing the above, and further provides mammalian
cell lines produced by such methods which contain a desired
exogenous DNA integrated at a target hot spot.

Background

Technology for expressing recombinant proteins in both
prokaryotic and eukaryotic organisms is well established.
Mammalian cells offer significant advantages over bacteria
or yeast for protein production, resulting from their
ability to correctly assemble, glycosylate and
post-translationally modify recombinantly expressed
proteins. After transfection into the host cells,
recombinant expression constructs can be maintained as
extrachromosomal elements, or may be integrated into the
host cell genome. Generation of stably transfected
mammalian cell lines usually involves the latter; a DNA
construct encoding a gene of interest along with a drug
resistance gene (dominant selectable marker) is introduced
into the host cell, and subsequent growth in the presence
of the drug allows for the selection of cells that have
successfully integrated the exogenous DNA. In many
instances, the gene of interest is linked to a drug
resistant selectable marker which can later be subjected to
gene amplification. The gene encoding dihydrofolate
reductase (DHFR) is most commonly used for this purpose.
Growth of cells in the presence of methotrexate, a
competitive inhibitor of DHFR, leads to increased DHFR
production by means of amplification of the DHFR gene. As
flanking regions of DNA will also become amplified, the
resultant coamplification of a DHFR-linked gene in the
transfected cell line can lead to increased protein
production, thereby resulting in high level expression of
the gene of interest.

While this approach has proven successful, there are a
number of problems with the system because of the random
nature of the integration event. These problems exist
because expression levels are greatly influenced by the
effects of the local genetic environment at the gene locus,
a phenomenon well documented in the literature and
generally referred to as "position effects" (for example,
see Al-Shawi et al, Mol. Cell. Biol., 10:1192-1198 (1990);
Yoshimura et al, Mol. Cell. Biol., 7:1296-1299 (1987)). As
the vast majority of mammalian DNA is in a
transcriptionally inactive state, random integration
methods offer no control over the transcriptional fate of
the integrated DNA. Consequently, wide variations in the
expression level of integrated genes can occur, depending
on the site of integration. For example, integration of
exogenous DNA into inactive, or transcriptionally "silent"
regions of the genome will result in little or no
expression. By contrast, integration into a
transcriptionally active site may result in high
expression.

Therefore, when the goal of the work is to obtain a high
level of gene expression, as is typically the desired
outcome of genetic engineering methods, it is generally
necessary to screen large numbers of transfectants to find
such a high producing clone. Additionally, random
integration of exogenous DNA into the genome can in some
instances disrupt important cellular genes, resulting in an
altered phenotype. These factors can make the generation of
high expressing stable mammalian cell lines a complicated
and laborious process.

Recently, our laboratory has described the use of DNA
vectors containing translationally impaired dominant
selectable markers in mammalian gene expression. (This is
disclosed in U.S. Serial No. 08/147,696 filed November 3,
1993, recently allowed.)

These vectors contain a translationally impaired neomycin
phosphotransferase (neo) gene as the dominant selectable
marker, artificially engineered to contain an intron into
which a DHFR gene along with a gene or genes of interest is
inserted. Use of these vectors as expression constructs has
been found to significantly reduce the total number of drug
resistant colonies produced, thereby facilitating the
screening procedure in relation to conventional mammalian
expression vectors. Furthermore, a significant percentage
of the clones obtained using this system are high
expressing clones. These results are apparently
attributable to the modifications made to the neo
selectable marker. Due to the translational impairment of
the neo gene, transfected cells will not produce enough neo
protein to survive drug selection, thereby decreasing the
overall number of drug resistant colonies. Additionally, a
higher percentage of the surviving clones will contain the
expression vector integrated into sites in the genome where
basal transcription levels are high, resulting in
overproduction of neo, thereby allowing the cells to
overcome the impairment of the neo gene. Concomitantly, the
genes of interest linked to neo will be subject to similar
elevated levels of transcription. This same advantage is
also true as a result of the artificial intron created
within neo; survival is dependent on the synthesis of a
functional neo gene, which is in turn dependent on correct
and efficient splicing of the neo introns. Moreover, these
criteria are more likely to be met if the vector DNA has
integrated into a region which is already highly
transcriptionally active.

Following integration of the vector into a
transcriptionally active region, gene amplification is
performed by selection for the DHFR gene. Using this
system, it has been possible to obtain clones selected
using low levels of methotrexate (50nM), containing few
(<10) copies of the vector which secrete high levels of
protein (>55pg/cell/day). Furthermore, this can be achieved
in a relatively short period of time. However, the success
in amplification is variable. Some transcriptionally active
sites cannot be amplified and therefore the frequency and
extent of amplification from a particular site is not
predictable.

Overall, the use of these translationally impaired vectors
represents a significant improvement over other methods of
random integration. However, as discussed, the problem of
lack of control over the integration site remains a
significant concern.

One approach to overcome the problems of random integration
is by means of gene targeting, whereby the exogenous DNA is
directed to a specific locus within the host genome. The
exogenous DNA is inserted by means of homologous
recombination occurring between sequences of DNA in the
expression vector and the corresponding homologous sequence
in the genome. However, while this type of recombination
occurs at a high frequency naturally in yeast and other
fungal organisms, in higher eukaryotic organisms it is an
extremely rare event. In mammalian cells, the frequency of
homologous versus non-homologous (random integration)
recombination is reported to range from 1/100 to 1/5000
(for example, see Capecchi, Science, 244:1288-1292 (1989);
Morrow and Kucherlapati, Curr. Op. Biotech., 4:577-582
(1993)).
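
To put the reported range in perspective, the following is
a minimal arithmetic sketch and not part of the disclosed
method. It assumes, as a simplification, that each
integrant is independently a homologous recombinant with a
probability equal to the reported ratio; the function names
are hypothetical.

```python
import math


def expected_integrants_screened(ratio: float) -> float:
    """Mean number of integrants examined up to and including the first
    homologous recombinant (mean of a geometric distribution)."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return 1.0 / ratio


def integrants_for_confidence(ratio: float, confidence: float = 0.95) -> int:
    """Smallest n such that at least one homologous recombinant is found
    among n integrants with the given probability: 1 - (1 - ratio)**n."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if ratio == 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - ratio))


# The two ends of the range quoted in the text (assumed to act as
# per-integrant probabilities for the purpose of this illustration).
for label, ratio in (("1/100", 1 / 100), ("1/5000", 1 / 5000)):
    print(label,
          round(expected_integrants_screened(ratio)),
          integrants_for_confidence(ratio))
```

Under these assumptions, an unselected screen would need to
examine on the order of hundreds to many thousands of
integrants, which is why a selection that only homologous
recombinants survive is attractive.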

One of the earliest reports describing homologous
recombination in mammalian cells comprised an artificial
system created in mouse fibroblasts (Thomas et al, Cell,
44:419-428 (1986)). A cell line containing a mutated,
non-functional version of the neo gene integrated into the
host genome was created, and subsequently targeted with a
second non-functional copy of neo containing a different
mutation. Reconstruction of a functional neo gene could
occur only by gene targeting. Homologous recombinants were
identified by selecting for G418 resistant cells, and
confirmed by analysis of genomic DNA isolated from the
resistant clones.

Recently, the use of homologous recombination to replace
the heavy and light immunoglobulin genes at endogenous loci
in antibody secreting cells has been reported. (U.S. Patent
No. 5,202,238, Fell et al (1993).) However, this particular
approach is not widely applicable, because it is limited to
the production of immunoglobulins in cells which
endogenously express immunoglobulins, e.g., B cells and
myeloma cells. Also, expression is limited to single copy
gene levels because co-amplification after homologous
recombination is not included. The method is further
complicated by the fact that two separate integration
events are required to produce a functional immunoglobulin:
one for the light chain gene followed by one for the heavy
chain gene.

An additional example of this type of system has been
reported in NS/0 cells, where recombinant immunoglobulins
are expressed by homologous recombination into the
immunoglobulin gamma 2A locus (Hollis et al, international
patent application # PCT/IB95/00014). Expression levels
obtained from this site were extremely high, on the order
of 20pg/cell/day from a single copy integrant. However, as
in the above example, expression is limited to this level
because an amplifiable gene is not cointegrated in this
system. Also, other researchers have reported aberrant
glycosylation of recombinant proteins expressed in NS/0
cells (for example, see Flesher et al, Biotech. and
Bioeng., 48:399-407 (1995)), thereby limiting the
applicability of this approach.

The cre-loxP recombination system from bacteriophage P1 has
recently been adapted and used as a means of gene targeting
in eukaryotic cells. Specifically, the site specific
integration of exogenous DNA into the Chinese hamster ovary
(CHO) cell genome using cre recombinase and a series of lox
containing vectors has been described. (Fukushige and
Sauer, Proc. Natl. Acad. Sci. USA, 89:7905-7909 (1992).)
This system is attractive in that it provides for
reproducible expression at the same chromosomal location.
However, no effort was made to identify a chromosomal site
from which gene expression is optimal, and as in the above
example, expression is limited to single copy levels in
this system. Also, it is complicated by the fact that one
needs to provide for expression of a functional recombinase
enzyme in the mammalian cell.

The use of homologous recombination between an introduced
DNA sequence and its endogenous chromosomal locus has also
been reported to provide a useful means of genetic
manipulation in mammalian cells, as well as in yeast cells.
(See e.g., Bradley et al, Meth. Enzymol., 223:855-879
(1993); Capecchi, Science, 244:1288-1292 (1989); Rothstein
et al, Meth. Enzymol., 194:281-301 (1991).) To date, most
mammalian gene targeting studies have been directed toward
gene disruption ("knockout") or site-specific mutagenesis
of selected target gene loci in mouse embryonic stem (ES)
cells. The creation of these "knockout" mouse models has
enabled scientists to examine specific structure-function
issues and examine the biological importance of a myriad of
mouse genes. This field of research also has important
implications in terms of potential gene therapy
applications.

Also, vectors have recently been reported by Celltech
(Kent, U.K.) which purportedly are targeted to
transcriptionally active sites in NSO cells, which do not
require gene amplification (Peakman et al, Hum. Antibod.
Hybridomas, 5:65-74 (1994)). However, levels of
immunoglobulin secretion in these unamplified cells have
not been reported to exceed 20pg/cell/day, while in
amplified CHO cells, levels as high as 100pg/cell/day can
be obtained (Id.).

It would be highly desirable to develop a gene targeting
system which reproducibly provided for the integration of
exogenous DNA into a predetermined site in the genome known
to be transcriptionally active. Also, it would be desirable
if such a gene targeting system would further facilitate
co-amplification of the inserted DNA after integration. The
design of such a system would allow for the reproducible
and high level expression of any cloned gene of interest in
a mammalian cell, and undoubtedly would be of significant
interest to many researchers.

In this application, we provide a novel mammalian
expression system, based on homologous recombination
occurring between two artificial substrates contained in
two different vectors. Specifically, this system uses a
combination of two novel mammalian expression vectors,
referred to as a "marking" vector and a "targeting" vector.

Essentially, the marking vector enables the identification
and marking of a site in the mammalian genome which is
transcriptionally active, i.e., a site at which gene
expression levels are high. This site can be regarded as a
"hot spot" in the genome. After integration of the marking
vector, the subject expression system enables another DNA,
i.e., the targeting vector, to be integrated at this site
by means of homologous recombination occurring between DNA
sequences common to both vectors. This system affords
significant advantages over other homologous recombination
systems.

Unlike most other homologous systems employed in mammalian
cells, this system exhibits no background. Therefore, cells
which have only undergone random integration of the vector
do not survive the selection. Thus, any gene of interest
cloned into the targeting plasmid is expressed at high
levels from the marked hot spot. Accordingly, the subject
method of gene expression substantially or completely
eliminates the problems inherent to systems of random
integration, discussed in detail above. Moreover, this
system provides reproducible and high level expression of
any recombinant protein at the same transcriptionally
active site in the mammalian genome. In addition, gene
amplification may be effected at this particular
transcriptionally active site by including an amplifiable
dominant selectable marker (e.g., DHFR) as part of the
marking vector.

Objects of the Invention

Thus, it is an object of the invention to provide an
improved method for targeting a desired DNA to a specific
site in a mammalian cell.

It is a more specific object of the invention to provide a
novel method for targeting a desired DNA to a specific site
in a mammalian cell via homologous recombination.

It is another specific object of the invention to provide
novel vectors for achieving site specific integration of a
desired DNA in a mammalian cell.

It is still another object of the invention to provide
novel mammalian cell lines which contain a desired DNA
integrated at a predetermined site which provides for high
expression.

It is a more specific object of the invention to provide a
novel method for achieving site specific integration of a
desired DNA in a Chinese hamster ovary (CHO) cell.

It is another more specific object of the invention to
provide a novel method for integrating immunoglobulin
genes, or any other genes, in mammalian cells at
predetermined chromosomal sites that provide for high
expression.

It is another specific object of the invention to provide
novel vectors and vector combinations suitable for
integrating immunoglobulin genes into mammalian cells at
predetermined sites that provide for high expression.

It is another object of the invention to provide mammalian
cell lines which contain immunoglobulin genes integrated at
predetermined sites that provide for high expression.

It is an even more specific object of the invention to
provide a novel method for integrating immunoglobulin genes
into CHO cells that provides for high expression, as well
as novel vectors and vector combinations that provide for
such integration of immunoglobulin genes into CHO cells.

In addition, it is a specific object of the invention to
provide novel CHO cell lines which contain immunoglobulin
genes integrated at predetermined sites that provide for
high expression, and have been amplified by methotrexate
selection to secrete even greater amounts of functional
immunoglobulins.

Brief Description of the Figures

Figure 1 depicts a map of a marking plasmid according to
the invention referred to as Desmond. The plasmid is shown
in circular form (1a) as well as a linearized version used
for transfection (1b).

Figure 2(a) shows a map of a targeting plasmid referred to
as "Molly". Molly is shown here encoding the anti-CD20
immunoglobulin genes, expression of which is described in
Example 1.

Figure 2(b) shows a linearized version of Molly, after
digestion with the restriction enzymes KpnI and PacI. This
linearized form was used for transfection.

Figure 3 depicts the potential alignment between Desmond
sequences integrated into the CHO genome, and incoming
targeting Molly sequences. One potential arrangement of
Molly integrated into Desmond after homologous
recombination is also presented.

Figure 4 shows a Southern analysis of single copy Desmond
clones. Samples are as follows:
Lane 1: λHindIII DNA size marker
Lane 2: Desmond clone 10F3
Lane 3: Desmond clone 10C12
Lane 4: Desmond clone 15C9
Lane 5: Desmond clone 14B5
Lane 6: Desmond clone 9B2

Figure 5 shows a Northern analysis of single copy Desmond
clones. Samples are as follows: Panel A: northern probed
with CAD and DHFR probes, as indicated on the figure.
Panel B: duplicate northern, probed with CAD and HisD
probes, as indicated. The RNA samples loaded in panels A
and B are as follows:
Lane 1: clone 9B2
Lane 2: clone 10C12
Lane 3: clone 14B5
Lane 4: clone 15C9
Lane 5: control RNA from CHO transfected with a HisD and
DHFR containing plasmid
Lane 6: untransfected CHO

Figure 6 shows a Southern analysis of clones resulting from
the homologous integration of Molly into Desmond. Samples
are as follows:
Lane 1: λHindIII DNA size markers
Lane 2: 20F4
Lane 3: 5F9
Lane 4: 21C7
Lane 5: 24G2
Lane 6: 25E1
Lane 7: 28C9
Lane 8: 29F9
Lane 9: 39G11
Lane 10: 42F9
Lane 11: 50G10
Lane 12: Molly plasmid DNA, linearized with BglII (top
band) and cut with BglII and KpnI (lower band)
Lane 13: untransfected Desmond

Figures 7A through 7G contain the Sequence Listing for
Desmond.

Figures 8A through 8I contain the Sequence Listing for
Molly containing anti-CD20.

Figure 9 contains a map of the targeting plasmid, "Mandy,"
shown here encoding anti-CD23 genes, the expression of
which is disclosed in Example 5.

Figures 10A through 10N contain the Sequence Listing for
"Mandy" containing the anti-CD23 genes as disclosed in
Example 5.

Detailed Description of the Invention

The invention provides a novel method for integrating a
desired exogenous DNA at a target site within the genome of
a mammalian cell via homologous recombination. Also, the
invention provides novel vectors for achieving the site
specific integration of a DNA at a target site in the
genome of a mammalian cell.

More specifically, the subject cloning method provides for
site specific integration of a desired DNA in a mammalian
cell by transfection of such cell with a "marker plasmid"
which contains a unique sequence that is foreign to the
mammalian cell genome and which provides a substrate for
homologous recombination, followed by transfection with a
"target plasmid" containing a sequence which provides for
homologous recombination with the unique sequence contained
in the marker plasmid, and further comprising a desired DNA
that is to be integrated into the mammalian cell.
Typically, the integrated DNA will encode a protein of
interest, such as an immunoglobulin or other secreted
mammalian glycoprotein.

The exemplified homologous recombination system uses the
neomycin phosphotransferase gene as a dominant selectable
marker. This particular marker was utilized based on the
following previously published observations:

(i) the demonstrated ability to target and restore function
to a mutated version of the neo gene (cited earlier) and

(ii) our development of translationally impaired expression
vectors, in which the neo gene has been artificially
created as two exons with a gene of interest inserted in
the intervening intron; neo exons are correctly spliced and
translated in vivo, producing a functional protein and
thereby conferring G418 resistance on the resultant cell
population.

In this application, the neo gene is split into three
exons. The third exon of neo is present on the "marker"
plasmid and becomes integrated into the host cell genome
upon integration of the marker plasmid into the mammalian
cells. Exons 1 and 2 are present on the targeting plasmid,
and are separated by an intervening intron into which at
least one gene of interest is cloned. Homologous
recombination of the targeting vector with the integrated
marking vector results in correct splicing of all three
exons of the neo gene and thereby expression of a
functional neo protein (as determined by selection for G418
resistant colonies). Prior to designing the current
expression system, we had experimentally tested the
functionality of such a triply spliced neo construct in
mammalian cells. The results of this control experiment
indicated that all three neo exons were properly spliced
and therefore suggested the feasibility of the subject
invention.
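
The selection logic of the split neo marker can be
illustrated with a toy model. This is a minimal sketch
under simplifying assumptions, not a description of the
actual constructs: a cell is treated as G418 resistant only
if exons 1, 2 and 3 sit together, in order, at one locus,
and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# A functional neo transcript needs all three exons, in order, at one locus.
FULL_NEO = ("exon1", "exon2", "exon3")


@dataclass(frozen=True)
class Locus:
    """Neo exons found at one chromosomal site, in 5'-to-3' order."""
    exons: Tuple[str, ...]


def is_g418_resistant(genome: Iterable[Locus]) -> bool:
    """Toy rule: resistant only if a single locus carries exons 1-3 in order."""
    return any(locus.exons == FULL_NEO for locus in genome)


# Marker plasmid integrated at the hot spot: contributes exon 3 only.
MARKED_SITE = Locus(("exon3",))
# Targeting plasmid: contributes exons 1 and 2 (the gene of interest
# would sit in the intron between them; it is not modeled here).
TARGETING_EXONS = ("exon1", "exon2")


def homologous_integration(site: Locus) -> Locus:
    """Targeting exons land next to exon 3 at the marked site."""
    return Locus(TARGETING_EXONS + site.exons)


def random_integration() -> Locus:
    """Targeting exons land at an unrelated site, away from exon 3."""
    return Locus(TARGETING_EXONS)


marked_only = [MARKED_SITE]
random_case = [MARKED_SITE, random_integration()]
targeted_case = [homologous_integration(MARKED_SITE)]

print(is_g418_resistant(marked_only))    # False: exon 3 alone
print(is_g418_resistant(random_case))    # False: exons on separate loci
print(is_g418_resistant(targeted_case))  # True: exons 1-3 reunited
```

In this simplified picture, only the homologous event
reunites the three exons, which mirrors the reasoning given
above for why random integrants are not expected to survive
G418 selection.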

However, while the present invention is exemplified using
the neo gene, and more specifically a triple split neo
gene, the general methodology should be efficacious with
other dominant selectable markers.

As discussed in greater detail infra, the present invention
affords numerous advantages over conventional gene
expression methods, including both random integration and
gene targeting methods. Specifically, the subject invention
provides a method which reproducibly allows for
site-specific integration of a desired DNA into a
transcriptionally active domain of a mammalian cell.
Moreover, because the subject method introduces an
artificial region of "homology" which acts as a unique
substrate for homologous recombination and the insertion of
a desired DNA, the efficacy of the subject invention does
not require that the cell endogenously contain or express a
specific DNA. Thus, the method is generically applicable to
all mammalian cells, and can be used to express any type of
recombinant protein.

The use of a triply spliced selectable marker, e.g., the
exemplified triply spliced neo construct, guarantees that
all G418 resistant colonies produced will arise from a
homologous recombination event (random integrants will not
produce a functional neo gene and consequently will not
survive G418 selection). Thus, the subject invention makes
it easy to screen for the desired homologous event.
Furthermore, the frequency of additional random
integrations in a cell that has undergone a homologous
recombination event appears to be low.

Based on the foregoing, it is apparent that a significant
advantage of the invention is that it substantially reduces
the number of colonies that need be screened to identify
high producer clones, i.e., cell lines containing a desired
DNA which secrete the corresponding protein at high levels.
On average, clones containing integrated desired DNA may be
identified by screening about 5 to 20 colonies (compared to
several thousand which must be screened when using standard
random integration techniques, or several hundred using the
previously described intronic insertion vectors).
Additionally, as the site of integration was preselected
and comprises a transcriptionally active domain, all
exogenous DNA expressed at this site should produce
comparable, i.e., high levels of the protein of interest.
Moreover, the subject invention is further advantageous in
that it enables an amplifiable gene to be inserted on
integration of the marking vector. Thus, when a desired
gene is targeted to this site via homologous recombination,
the subject invention allows for expression of the gene to
be further enhanced by gene amplification. In this regard,
it has been reported in the literature that different
genomic sites have different capacities for gene
amplification (Meinkoth et al, Mol. Cell. Biol.,
7:1415-1424 (1987)). Therefore, this technique is further
advantageous as it allows for the placement of a desired
gene of interest at a specific site that is both
transcriptionally active and easily amplified.
Consequently, this should significantly reduce the amount
of time required to isolate such high producers.
Specifically, while conventional methods for the
construction of high expressing mammalian cell lines can
take 6 to 9 months, the present invention allows for such
clones to be isolated on average after only about 3-6
months. This is due to the fact that conventionally
isolated clones typically must be subjected to at least
three rounds of drug resistant gene amplification in order
to reach satisfactory levels of gene expression. As the
homologously produced clones are generated from a
preselected site which is a high expression site, fewer
rounds of amplification should be required before reaching
a satisfactory level of production.
Still further, the subject invention enables the
reproducible selection of high producer clones wherein the
vector is integrated at low copy number, typically single
copy. This is advantageous as it enhances the stability of
the clones and avoids other potential adverse side-effects
associated with high copy number. As described supra, the
subject homologous recombination system uses the
combination of a "marker plasmid" and a "targeting plasmid"
which are described in more detail below.

The "marker plasmid" which is used to mark and identify a
transcriptional hot spot will comprise at least the
following sequences:

(i) a region of DNA that is heterologous or unique to the
genome of the mammalian cell, which functions as a source
of homology and allows for homologous recombination (with a
DNA contained in a second target plasmid). More
specifically, the unique region of DNA (i) will generally
comprise a bacterial, viral, yeast, synthetic, or other DNA
which is not normally present in the mammalian cell genome
and which further does not comprise significant homology or
sequence identity to DNA contained in the genome of the
mammalian cell. Essentially, this sequence should be
sufficiently different from mammalian DNA that it will not
significantly recombine with the host cell genome via
homologous recombination. The size of such unique DNA will
generally be at least about 2 to 10 kilobases, or higher,
more preferably at least about 10kb, as several other
investigators have noted an increased frequency of targeted
recombination as the size of the homology region is
increased (Capecchi, Science, 244:1288-1292 (1989)).

The upper size limit of the unique DNA which acts as a site
for homologous recombination with a sequence in the second
target vector is largely dictated by potential stability
constraints (if DNA is too large it may not be easily
integrated into a chromosome) and the difficulties in
working with very large DNAs.

(ii) a DNA including a fragment of a selectable marker DNA,
typically an exon of a dominant selectable marker gene. The
only essential feature of this DNA is that it not encode a
functional selectable marker protein unless it is expressed
in association with a sequence contained in the target
plasmid. Typically, the target plasmid will comprise the
remaining exons of the dominant selectable marker gene
(those not comprised in the "marker" plasmid). Essentially,
a functional selectable marker should only be produced if
homologous recombination occurs (resulting in the
association and expression of this marker DNA (ii) sequence
together with the portion(s) of the selectable marker DNA
fragment which is (are) contained in the target plasmid).

As noted, the current invention exemplifies the use of the
neomycin phosphotransferase gene as the dominant selectable
marker which is "split" in the two vectors. However, other
selectable markers should also be suitable, e.g., the
Salmonella histidinol dehydrogenase gene, hygromycin
phosphotransferase gene, herpes simplex virus thymidine
kinase gene, adenosine deaminase gene, glutamine synthetase
gene and hypoxanthine-guanine phosphoribosyl transferase
gene.

(iii) a DNA which encodes a functional selectable marker
protein, which selectable marker is different from the
selectable marker DNA (ii). This selectable marker provides
for the successful selection of mammalian cells wherein the
marker plasmid is successfully integrated into the cellular
DNA. More preferably, it is desirable that the marker
plasmid comprise two such dominant selectable marker DNAs,
situated at opposite ends of the vector. This is
advantageous as it enables integrants to be selected using
different selection agents and further enables cells which
contain the entire vector to be selected. Additionally, one
marker can be an amplifiable marker to facilitate gene
amplification as discussed previously. Any of the dominant
selectable markers listed in (ii) can be used as well as
others generally known in the art.

Moreover, the marker plasmid may optionally further
comprise a rare endonuclease restriction site. This is
potentially desirable as this may facilitate cleavage. If
present, such rare restriction site should be situated
close to the middle of the unique region that acts as a
substrate for homologous recombination. Preferably such
sequence will be at least about 12 nucleotides. The
introduction of a double stranded break by similar
methodology has been reported to enhance the frequency of
homologous recombination (Choulika et al, Mol. Cell. Biol.,
15:1968-1973 (1995)). However, the presence of such
sequence is not essential.

The "targeting plasmid" will comprise at least the
following sequences:

(1) the same unique region of DNA contained in the
marker plasmid or one having sufficient homology or
sequence identity therewith that said DNA is capable of
combining via homologous recombination with the unique
region (i) in the marker plasmid. Suitable types of
DNAs are described supra in the description of the
unique region of DNA (i) in the marker plasmid.

(2) The remaining exons of the dominant selectable
marker, one exon of which is included as (ii) in the
marker plasmid listed above. The essential feature of
this DNA fragment is that it results in a functional
(selectable) marker protein only if the target plasmid
integrates via homologous recombination (wherein such
recombination results in the association of this DNA
with the other fragment of the selectable marker DNA
contained in the marker plasmid) and further that it
allows for insertion of a desired exogenous DNA. Typically,
this DNA will comprise the remaining exons of the
selectable marker DNA which are separated by an intron.
For example, this DNA may comprise the first two exons
of the neo gene and the marker plasmid may comprise the
third exon (back third of neo).

(3) The target plasmid will also comprise a desired
DNA, e.g., one encoding a desired polypeptide,
preferably inserted within the selectable marker DNA
fragment contained in the plasmid. Typically, the DNA
will be inserted in an intron which is comprised between
the exons of the selectable marker DNA. This ensures
that the desired DNA is also integrated if homologous
recombination of the target plasmid and the marker plasmid
occurs. This intron may be naturally occurring or
it may be engineered into the dominant selectable marker
DNA fragment.
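
The split-marker logic of items (1) to (3) can be illustrated with a toy model. The following is only a sketch under stated assumptions: each construct is represented as an ordered list of labels, the function names (recombine, is_neo_functional) are hypothetical, and the model ignores the actual element arrangement shown in the figures. It shows only that a complete marker arises when the targeting plasmid joins the marked site, and not when either construct is present alone.

```python
# Toy model of the "split" selectable marker. All names and the
# list-of-labels representation are illustrative assumptions.

HOMOLOGY = "homology"

# Marked genomic site left by the marker plasmid: the unique homology
# region followed by the final neo exon.
marked_site = [HOMOLOGY, "neo_exon3"]

# Targeting plasmid: the first two neo exons, with the desired DNA placed
# in the intron between them, followed by the same homology region.
targeting_plasmid = ["neo_exon1", "intron:desired_gene", "neo_exon2", HOMOLOGY]


def recombine(site, plasmid):
    """Place the plasmid elements upstream of its homology region
    immediately upstream of the matching homology region in the site."""
    cut = plasmid.index(HOMOLOGY)
    at = site.index(HOMOLOGY)
    return site[:at] + plasmid[:cut] + site[at:]


def is_neo_functional(locus):
    """The marker is treated as functional only if all three exons are
    present at one locus in the correct order."""
    exons = [e for e in locus if e.startswith("neo_exon")]
    return exons == ["neo_exon1", "neo_exon2", "neo_exon3"]


targeted_locus = recombine(marked_site, targeting_plasmid)

print(is_neo_functional(marked_site))        # marked site alone
print(is_neo_functional(targeting_plasmid))  # random integration of the plasmid
print(is_neo_functional(targeted_locus))     # homologous integration
```

In this simplified picture, the desired DNA travels with the exons into the targeted locus, which is why selection for the reconstituted marker also selects for the inserted gene.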

This DNA will encode any desired protein,
preferably one having pharmaceutical or other desirable
properties. Most typically the DNA will encode a
mammalian protein, and in the current examples provided,
an immunoglobulin or an immunoadhesin. However, the
invention is not in any way limited to the production of
immunoglobulins.

As discussed previously, the subject cloning method
is suitable for any mammalian cell as it does not require
for efficacy that any specific mammalian sequence
or sequences be present. In general, such mammalian
cells will comprise those typically used for protein
expression, e.g., CHO cells, myeloma cells, COS cells,
BHK cells, Sp2/0 cells, NIH 3T3 and HeLa cells. In the
examples which follow, CHO cells were utilized. The
advantages thereof include the availability of suitable
growth medium, their ability to grow efficiently and to
high density in culture, and their ability to express
mammalian proteins such as immunoglobulins in biologically
active form.

Further, CHO cells were selected in large part
because of previous usage of such cells by the inventors
for the expression of immunoglobulins (using the
translationally impaired dominant selectable marker
containing vectors described previously). Thus, the present
laboratory has considerable experience in using such
cells for expression. However, based on the examples
which follow, it is reasonable to expect similar results
will be obtained with other mammalian cells.

In general, transformation or transfection of mammalian
cells according to the subject invention will be
effected according to conventional methods. So that the
invention may be better understood, the construction of
exemplary vectors and their usage in producing integrants
is described in the examples below.

EXAMPLE 1

Design and Preparation of Marker
and Targeting Plasmid DNA Vectors

The marker plasmid herein referred to as "Desmond"
was assembled from the following DNA elements:

(a) Murine dihydrofolate reductase gene (DHFR),
incorporated into a transcription cassette, comprising
the mouse beta globin promoter 5' to the DHFR start
site, and bovine growth hormone polyadenylation signal
3' to the stop codon. The DHFR transcriptional cassette
was isolated from TCAE6, an expression vector created
previously in this laboratory (Newman et al, 1992,
Biotechnology, 10:1455-1460).

(b) E. coli β-galactosidase gene, commercially
available, obtained from Promega as pSV-β-galactosidase
control vector, catalog # E1081.

(c) Baculovirus DNA, commercially available, purchased
from Clontech as pBAKPAK8, cat. # 6145-1.

(d) Cassette comprising promoter and enhancer elements
from Cytomegalovirus and SV40 virus. The cassette
was generated by PCR using a derivative of expression
vector TCAE8 (Reff et al, Blood, 83:435-445 (1994)).
The enhancer cassette was inserted within the baculovirus
sequence, which was first modified by the insertion
of a multiple cloning site.

(e) E. coli GUS (glucuronidase) gene, commercially
available, purchased from Clontech as pBI101, cat. #
6017-1.

(f) Firefly luciferase gene, commercially available,
obtained from Promega as pGEM-Luc (catalog # E1541).

(g) S. typhimurium histidinol dehydrogenase gene
(HisD). This gene was originally a gift (Donahue
et al, Gene, 18:47-59 (1982)), and has subsequently been
incorporated into a transcription cassette comprising
the mouse beta globin major promoter 5' to the gene, and
the SV40 polyadenylation signal 3' to the gene.

The DNA elements described in (a)-(g) were combined
into a pBR derived plasmid backbone to produce a 7.7kb
contiguous stretch of DNA referred to in the attached
figures as "homology". Homology in this sense refers to
sequences of DNA which are not part of the mammalian
genome and are used to promote homologous recombination
between transfected plasmids sharing the same homology
DNA sequences.

(h) Neomycin phosphotransferase gene from Tn5 (Davis
and Smith, Ann. Rev. Micro., 32:469-518 (1978)).
The complete neo gene was subcloned into pBluescript
SK- (Stratagene catalog # 212205) to facilitate genetic
manipulation. A synthetic linker was then inserted into
a unique PstI site occurring across the codons for amino
acids 51 and 52 of neo. This linker encoded the necessary
DNA elements to create an artificial splice donor
site, intervening intron and splice acceptor site within
the neo gene, thus creating two separate exons, presently
referred to as neo exons 1 and 2. Neo exon 1 encodes
the first 51 amino acids of neo, while exon 2 encodes
the remaining 203 amino acids plus the stop codon of the
protein. A NotI cloning site was also created within the
intron.

Neo exon 2 was further subdivided to produce neo
exons 2 and 3. This was achieved as follows: A set of
PCR primers was designed to amplify a region of DNA
encoding neo exon 1, intron and the first 111 2/3 amino
acids of exon 2. The 3' PCR primer resulted in the
introduction of a new 5' splice site immediately after
the second nucleotide of the codon for amino acid 111 in
exon 2, therefore generating a new smaller exon 2. The
DNA fragment now encoding the original exon 1, intron
and new exon 2 was then subcloned and propagated in a
pBR based vector. The remainder of the original exon 2
was used as a template for another round of PCR
amplification, which generated "exon 3". The 5' primer
for this round of amplification introduced a new splice
acceptor site at the 5' side of the newly created exon
3, i.e. before the final nucleotide of the codon for
amino acid 111. The resultant 3 exons of neo encode the






following information: exon 1 - the first 51 amino acids
of neo; exon 2 - the next 111 2/3 amino acids; and exon
3 - the final 91 1/3 amino acids plus the translational
stop codon of the neo gene.
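
The exon sizes quoted above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it takes the amino acid counts stated in the text, assumes three nucleotides per codon, and confirms that the three-exon split accounts for the same 254 codons as the earlier two-exon split (51 + 203). The variable names are hypothetical.

```python
from fractions import Fraction

# Amino acids encoded by each neo exon, as stated in the text. Fractions
# represent codons that are split across an exon boundary.
exon_codons = {
    "exon 1": Fraction(51),
    "exon 2": Fraction(111) + Fraction(2, 3),
    "exon 3": Fraction(91) + Fraction(1, 3),
}

total_codons = sum(exon_codons.values())

# Coding nucleotides per exon (3 per codon). Exon 3 additionally carries
# the three nucleotides of the translational stop codon.
exon_nt = {name: int(codons * 3) for name, codons in exon_codons.items()}
exon_nt["exon 3"] += 3

print(total_codons)  # 254
print(exon_nt)       # {'exon 1': 153, 'exon 2': 335, 'exon 3': 277}
```

The fractional values simply reflect that the new splice junction falls after the second nucleotide of a codon, so one codon is shared between exons 2 and 3.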
Neo exon 3 was incorporated along with the above
mentioned DNA elements into the marking plasmid
"Desmond". Neo exons 1 and 2 were incorporated into the
targeting plasmid "Molly". The NotI cloning site created
within the intron between exons 1 and 2 was used in
subsequent cloning steps to insert genes of interest
into the targeting plasmid.

A second targeting plasmid "Mandy" was also
generated. This plasmid is almost identical to "Molly"
(some restriction sites on the vector have been changed)
except that the original HisD and DHFR genes contained
in "Molly" were inactivated. These changes were
incorporated because the Desmond cell line was no longer
being cultured in the presence of histidinol, therefore
it seemed unnecessary to include a second copy of the
HisD gene. Additionally, the DHFR gene was inactivated
to ensure that only a single DHFR gene, namely the one
present in the Desmond marked site, would be amplifiable
in any resulting cell lines. "Mandy" was derived from
"Molly" by the following modifications:

(i) A synthetic linker was inserted in the middle
of the DHFR coding region. This linker created a stop
codon and shifted the remainder of the DHFR coding
region out of frame, therefore rendering the gene
nonfunctional.

(ii) A portion of the HisD gene was deleted and
replaced with a PCR generated HisD fragment lacking the
promoter and start codon of the gene.

Figure 1 depicts the arrangement of these DNA elements
in the marker plasmid "Desmond". Figure 2 depicts
the arrangement of these elements in the first targeting
plasmid, "Molly". Figure 3 illustrates the possible
arrangement in the CHO genome of the various DNA
elements after targeting and integration of Molly DNA
into Desmond marked CHO cells. Figure 9 depicts the
targeting plasmid "Mandy."

Construction of the marking and targeting plasmids
from the above listed DNA elements was carried out following
conventional cloning techniques (see, e.g.,
Molecular Cloning, A Laboratory Manual, J. Sambrook et
al, 1987, Cold Spring Harbor Laboratory Press, and
Current Protocols in Molecular Biology, F. M. Ausubel et
al, eds., 1987, John Wiley and Sons). All plasmids were
propagated and maintained in E. coli XL1-Blue
(Stratagene, cat. # 200236). Large scale plasmid
preparations were prepared using the Promega Wizard Maxiprep
DNA Purification System®, according to the
manufacturer's directions.

EXAMPLE 2

Construction of a Marked CHO Cell Line

1. Cell Culture and Transfection Procedures to
Produce Marked CHO Cell Line

Marker plasmid DNA was linearized by digestion
overnight at 37°C with Bst1107I. Linearized vector was
ethanol precipitated and resuspended in sterile TE to a
concentration of 1mg/ml. Linearized vector was introduced
into DHFR- Chinese hamster ovary (CHO)
DG44 cells (Urlaub et al, Som. Cell and Mol. Gen.,
12:555-566 (1986)) by electroporation as follows.

Exponentially growing cells were harvested by centrifugation,
washed once in ice cold SBS (sucrose
buffered solution, 272mM sucrose, 7mM sodium phosphate,
pH 7.4, 1mM magnesium chloride) then resuspended in SBS
to a concentration of 10^7 cells/ml. After a 15 minute
incubation on ice, 0.4ml of the cell suspension was
mixed with 40 µg linearized DNA in a disposable
electroporation cuvette. Cells were shocked using a BTX
electrocell manipulator (San Diego, CA) set at 230
volts, 400 microfaraday capacitance, 13 ohm resistance.
Shocked cells were then mixed with 20 ml of prewarmed
CHO growth media (CHO-S-SFMII, Gibco/BRL, catalog #
31033-012) and plated in 96 well tissue culture plates.
Forty eight hours after electroporation, plates were fed
with selection media (in the case of transfection with
Desmond, selection media is CHO-S-SFMII without
hypoxanthine or thymidine, supplemented with 2mM
histidinol (Sigma catalog # H6647)). Plates were maintained
in selection media for up to 30 days, or until
some of the wells exhibited cell growth. These cells
were then removed from the 96 well plates and expanded
ultimately to 120 ml spinner flasks where they were
maintained in selection media at all times.
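
The quantities in the electroporation protocol above imply a few derived numbers that are convenient when setting up the experiment. The following is a minimal sketch, assuming only the values stated in the text; the variable names are hypothetical, and the dilution estimate ignores the small cuvette volume carried over into the 20 ml of medium.

```python
# Values stated in the protocol.
cell_density_per_ml = 1e7      # cells/ml in SBS
cuvette_volume_ml = 0.4        # cell suspension per electroporation
dna_ug = 40.0                  # linearized DNA per electroporation
dna_stock_ug_per_ul = 1.0      # 1 mg/ml stock equals 1 ug/ul
recovery_volume_ml = 20.0      # prewarmed growth medium added after shock

# Derived quantities.
cells_per_shock = cell_density_per_ml * cuvette_volume_ml
dna_volume_ul = dna_ug / dna_stock_ug_per_ul
dna_ug_per_million_cells = dna_ug / (cells_per_shock / 1e6)
# Approximation: the 0.4 ml cuvette volume is neglected here.
plating_density_per_ml = cells_per_shock / recovery_volume_ml

print(f"cells per electroporation: {cells_per_shock:.1e}")
print(f"DNA stock volume: {dna_volume_ul:.0f} ul")
print(f"DNA per 10^6 cells: {dna_ug_per_million_cells:.0f} ug")
print(f"approximate density after dilution: {plating_density_per_ml:.1e} cells/ml")
```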

EXAMPLE 3

Characterization of Marked CHO Cell Lines

(a) Southern Analysis

Genomic DNA was isolated from all stably growing
Desmond marked CHO cells. DNA was isolated using the
Invitrogen Easy® DNA kit, according to the manufacturer's
directions. Genomic DNA was then digested with
HindIII overnight at 37°C, and subjected to Southern
analysis using a PCR generated digoxygenin labelled
probe specific to the DHFR gene. Hybridizations and
washes were carried out using Boehringer Mannheim's DIG
easy hyb (catalog # 1603 558) and DIG Wash and Block
Buffer Set (catalog # 1585 762) according to the
manufacturer's directions. DNA samples containing a single
band hybridizing to the DHFR probe were assumed to be
Desmond clones arising from a single cell which had
integrated a single copy of the plasmid. These clones
were retained for further analysis. Out of a total of
45 HisD resistant cell lines isolated, only 5 were
single copy integrants. Figure 4 shows a Southern blot
containing all 5 of these single copy Desmond clones.
Clone names are provided in the figure legend.
(b) Northern Analysis

Total RNA was isolated from all single copy Desmond
clones using TRIzol reagent (Gibco/BRL cat # 15596-026)
according to the manufacturer's directions. 10-20 µg RNA
from each clone was analyzed on duplicate formaldehyde
gels. The resulting blots were probed with PCR
generated digoxygenin labelled DNA probes to (i) DHFR
message, (ii) HisD message and (iii) CAD message. CAD
is a trifunctional protein involved in uridine
biosynthesis (Wahl et al, J. Biol. Chem., 254, 17:8679-8689
(1979)), and is expressed equally in all cell
types. It is used here as an internal control to help
quantitate RNA loading. Hybridizations and washes were
carried out using the above mentioned Boehringer
Mannheim reagents. The results of the Northern analysis
are shown in Figure 5. The single copy Desmond clone
exhibiting the highest levels of both the HisD and DHFR
message is clone 15C9, shown in lane 4 in both panels of
the figure. This clone was designated as the "marked
cell line" and used in future targeting experiments in
CHO, examples of which are presented in the following
sections.

EXAMPLE 4

Expression of Anti-CD20 Antibody
in Desmond Marked CHO Cells

C2B8, a chimeric antibody which recognizes B-cell
surface antigen CD20, has been cloned and expressed
previously in our laboratory (Reff et al, Blood,
83:435-445 (1994)). A 4.1 kb DNA fragment comprising the
C2B8 light and heavy chain genes, along with the necessary
regulatory elements (eukaryotic promoter and
polyadenylation signals) was inserted into the artificial
intron created between exons 1 and 2 of the neo gene
contained in a pBR derived cloning vector. This newly
generated 5kb DNA fragment (comprising neo exon 1, C2B8
and neo exon 2) was excised and used to assemble the
targeting plasmid Molly. The other DNA elements used in
the construction of Molly are identical to those used to
construct the marking plasmid Desmond, identified
previously. A complete map of Molly is shown in Fig. 2.

The targeting vector Molly was linearized prior to
transfection by digestion with KpnI and PacI, ethanol
precipitated and resuspended in sterile TE to a concentration
of 1.5mg/mL. Linearized plasmid was introduced
into exponentially growing Desmond marked cells essentially
as described, except that 80 µg DNA was used in
each electroporation. Forty eight hours postelectroporation,
96 well plates were supplemented with selection
medium: CHO-S-SFMII supplemented with 400 µg/mL Geneticin
(G418, Gibco/BRL catalog # 10131-019). Plates were
maintained in selection medium for up to 30 days, or
until cell growth occurred in some of the wells. Such
growth was assumed to be the result of clonal expansion
of a single G418 resistant cell. The supernatants from
all G418 resistant wells were assayed for C2B8 production
by standard ELISA techniques, and all productive
clones were eventually expanded to 120mL spinner flasks
and further analyzed.

Characterization of Antibody Secreting Targeted Cells

A total of 50 electroporations with Molly targeting
plasmid were carried out in this experiment, each of
which was plated into separate 96 well plates. A total
of 10 viable, anti-CD20 antibody secreting clones were
obtained and expanded to 120ml spinner flasks. Genomic
DNA was isolated from all clones, and Southern analyses
were subsequently performed to determine whether the
clones represented single homologous recombination
events or whether additional random integrations had
occurred in the same cells. The methods for DNA isolation
and Southern hybridization were as described in the
previous section. Genomic DNA was digested with EcoRI
and probed with a PCR generated digoxygenin labelled
probe to a segment of the CD20 heavy chain constant
region. The results of this Southern analysis are presented
in Figure 6. As can be seen in the figure, 8 of
the 10 clones show a single band hybridizing to the CD20
probe, indicating a single homologous recombination
event has occurred in these cells. Two of the ten,
clones 24G2 and 28C9, show the presence of additional
band(s), indicative of an additional random integration
elsewhere in the genome.

We examined the expression levels of anti-CD20
antibody in all ten of these clones, the data for which
are shown in Table 1, below.

Table 1:

Expression Level of Anti-CD20
Secreting Homologous Integrants

Clone       Anti-CD20, pg/c/d

20F4        3.5
25E1        2.4
42F9        1.8
39G11       1.5
21C7        1.3
50G10       0.9
29F9        0.8
5F9         0.3

28C9*       4.5
24G2*       2.1

* These clones contained additional randomly
integrated copies of anti-CD20. Expression
levels of these clones therefore reflect a
contribution from both the homologous and random
sites.

Expression levels are reported as picogram per cell per
day (pg/c/d) secreted by the individual clones, and
represent the mean levels obtained from three separate
ELISAs on samples taken from 120 mL spinner flasks.

As can be seen from the data, there is a variation
in antibody secretion of approximately ten fold between
the highest and lowest clones. This was somewhat unexpected
as we anticipated similar expression levels from
all clones due to the fact the anti-CD20 genes are all
integrated into the same Desmond marked site. Nevertheless,
this observed range in expression is extremely small
in comparison to that seen using any traditional random
integration method or with our translationally impaired
vector system.
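
The "approximately ten fold" variation can be checked directly against the single-site clones listed in Table 1. This is a small illustrative calculation using the tabulated values only; the two clones marked with an asterisk are excluded because they carry additional random integrants, and the variable names are hypothetical.

```python
# Expression levels (pg/cell/day) of the eight single homologous
# integrants reported in Table 1.
single_site_clones = {
    "20F4": 3.5,
    "25E1": 2.4,
    "42F9": 1.8,
    "39G11": 1.5,
    "21C7": 1.3,
    "50G10": 0.9,
    "29F9": 0.8,
    "5F9": 0.3,
}

highest = max(single_site_clones.values())
lowest = min(single_site_clones.values())
fold_range = highest / lowest

print(f"highest: {highest} pg/c/d, lowest: {lowest} pg/c/d")
print(f"fold variation: {fold_range:.1f}")
```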

Clone 20F4, the highest producing single copy integrant,
was selected for further study. Table 2 (below)
presents ELISA and cell culture data from seven day
production runs of this clone.

Table 2:

7 Day Production Run Data for 20F4

               Viable/ml
Day  % Viable  (x 10^5)   Tx2 (hr)  mg/L  pg/c/d

1    96        3.4        31        1.3   4.9
2    94        6          29        2.5   3.4
3    94        9.9        33        4.7   3.2
4    90        17.4       30        6.8   3
5    73        14                   8.3
6    17        3.5                  9.5

Clone 20F4 was seeded at 2 x 10^5/ml in a 120ml spinner
flask on day 0. On the following six days, cell counts
were taken, doubling times calculated and 1ml samples
of supernatant removed from the flask and analyzed for
secreted anti-CD20 by ELISA.
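
The doubling times (Tx2) in Table 2 are consistent with a simple exponential growth calculation between daily counts. The sketch below is an assumption about how such values can be derived, not a statement of the method actually used: it assumes counts taken 24 hours apart, a seeding density of 2 x 10^5 viable cells/ml on day 0, and the standard relation Tx2 = t · ln 2 / ln(N_end / N_start). The function name is hypothetical.

```python
import math


def doubling_time_hours(n_start, n_end, elapsed_hours=24.0):
    """Doubling time for exponential growth from n_start to n_end
    over elapsed_hours."""
    return elapsed_hours * math.log(2) / math.log(n_end / n_start)


# Viable cells/ml (x 10^5) on days 0 to 4; day 0 is the seeding density.
viable = [2.0, 3.4, 6.0, 9.9, 17.4]
reported_tx2 = [31, 29, 33, 30]  # hours, days 1 to 4 in Table 2

computed_tx2 = [
    doubling_time_hours(start, end) for start, end in zip(viable, viable[1:])
]

for day, (calc, rep) in enumerate(zip(computed_tx2, reported_tx2), start=1):
    print(f"day {day}: computed {calc:.1f} h, reported {rep} h")
```

Under these assumptions the computed values fall within about half an hour to an hour of the tabulated ones. No doubling time is meaningful for days 5 and 6, when the viable count is falling.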

This clone is secreting on average 3-5pg antibody/cell/day,
based on this ELISA data. This is the same
level as obtained from other high expressing single copy
clones obtained previously in our laboratory using the
previously developed translationally impaired random
integration vectors. This result indicates the following:

(1) that the site in the CHO genome marked by the
Desmond marking vector is highly transcriptionally active,
and therefore represents an excellent site from
which to express recombinant proteins, and

(2) that targeting by means of homologous recombination
can be accomplished using the subject vectors and
occurs at a frequency high enough to make this system a
viable and desirable alternative to random integration
methods.

To further demonstrate the efficacy of this system,
we have also demonstrated that this site is amplifiable,
resulting in even higher levels of gene expression and
protein secretion. Amplification was achieved by plating
serial dilutions of 20F4 cells, starting at a density
of 2.5 x 10^4 cells/ml, in 96 well tissue culture
dishes, and culturing these cells in media (CHO-S-SFMII)
supplemented with 5, 10, 15 or 20nM methotrexate. Antibody
secreting clones were screened using standard ELISA
techniques, and the highest producing clones were expanded
and further analyzed. A summary of this amplification
experiment is presented in Table 3 below.

Table 3:

Summary of 20F4 Amplification

         # Wells   Expression Level  # Wells   Expression Level
nM MTX   Assayed   mg/l, 96 well     Expanded  pg/c/d from spinner

10       56        3-13              4         10-15
15       27        2-14              3         15-18
20       17        4-11              1         nd

Methotrexate amplification of 20F4 was set up as described
in the text, using the concentrations of methotrexate
indicated in the above table. Supernatants
from all surviving 96 well colonies were assayed by
ELISA, and the range of anti-CD20 expressed by these
clones is indicated in column 3. Based on these results,
the highest producing clones were expanded to
120ml spinners and several ELISAs conducted on the
spinner supernatants to determine the pg/cell/day expression
levels, reported in column 5.



The data here clearly demonstrate that this site can be
amplified in the presence of methotrexate. Clones from
the 10 and 15nM amplifications were found to produce on
the order of 15-20pg/cell/day.

A 15nM clone, designated 20F4-15A5, was selected as 
the highest expressing cell line. This clone originated 
from a 96 well plate in which only 22 wells grew, and 
was therefore assumed to have arisen from a single cell. 
The clone was then subjected to a further round of methotrexate
amplification. As described above, serial
dilutions of the culture were plated into 96 well dishes
and cultured in CHO-S-SFMII medium supplemented with
200, 300 or 400nM methotrexate. Surviving clones were
screened by ELISA, and several high producing clones
were expanded to spinner cultures and further analyzed.
A summary of this second amplification experiment is
presented in Table 4.

Table 4:

Summary of 20F4-15A5 Amplification

         # Wells   Expression Level  # Wells   Expression Level
nM MTX   Assayed   mg/l, 96 well     Expanded  pg/c/d, spinner

200      67        23-70             1         50-60
250      86        21-70             4         55-60
300      81        15-75             3         40-50

Methotrexate amplifications of 20F4-15A5 were set up
and assayed as described in the text. The highest
producing wells, the numbers of which are indicated in
column 4, were expanded to 120ml spinner flasks. The
expression levels of the cell lines derived from these
wells are recorded as pg/c/d in column 5.

The highest producing clone came from the 250nM methotrexate
amplification. The 250nM clone, 20F4-15A5-250A6,
originated from a 96 well plate in which only wells
grew, and therefore is assumed to have arisen from a
single cell. Taken together, the data in Tables 3 and 4
strongly indicate that two rounds of methotrexate amplification
are sufficient to reach expression levels of
60pg/cell/day, which is approaching the maximum secretion
capacity of immunoglobulin in mammalian cells
(Reff, M.E., Curr. Opin. Biotech., 4:573-576 (1993)).
The ability to reach this secretion capacity with just
two amplification steps further enhances the utility of
this homologous recombination system. Typically, random
integration methods require more than two amplification
steps to reach this expression level and are generally
less reliable in terms of the ease of amplification.
Thus, the homologous system offers a more efficient and
time saving method of achieving high level gene expression
in mammalian cells.
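
The gain from each round of amplification can be expressed as a fold increase over the unamplified clone. The following sketch uses only ranges reported in this example (3-5 pg/c/d for 20F4, 15-18 pg/c/d for the 15nM clones in Table 3, and 55-60 pg/c/d for the 250nM clones in Table 4); combining range endpoints into conservative and optimistic bounds is an assumption made for illustration, and the names are hypothetical.

```python
# Reported expression ranges in pg/cell/day as (low, high).
unamplified = (3.0, 5.0)    # clone 20F4 before amplification
first_round = (15.0, 18.0)  # 15nM methotrexate, Table 3
second_round = (55.0, 60.0) # 250nM methotrexate, Table 4


def fold_bounds(amplified, baseline):
    """Return (conservative, optimistic) fold increase between two ranges."""
    conservative = amplified[0] / baseline[1]
    optimistic = amplified[1] / baseline[0]
    return conservative, optimistic


first_fold = fold_bounds(first_round, unamplified)
second_fold = fold_bounds(second_round, unamplified)

print(f"first round:  {first_fold[0]:.0f}- to {first_fold[1]:.0f}-fold")
print(f"second round: {second_fold[0]:.0f}- to {second_fold[1]:.0f}-fold")
```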

EXAMPLE 5

Expression of Anti-Human CD23 Antibody
in Desmond Marked CHO Cells

CD23 is the low affinity IgE receptor which mediates
binding of IgE to B and T lymphocytes (Sutton, B.J., and
Gould, H.J., Nature, 366:421-428 (1993)). Anti-human
CD23 monoclonal antibody 5E8 is a human gamma-1 monoclonal
antibody recently cloned and expressed in our
laboratory. This antibody is disclosed in commonly
assigned Serial No. 08/803,085, filed on February 20,
1997.

The heavy and light chain genes of 5E8 were cloned
into the mammalian expression vector N5KG1, a derivative
of the vector NEOSPLA (Barnett et al, in Antibody
Expression and Engineering, H. Y. Yang and T. Imanaka, eds.,
pp27-40 (1995)) and two modifications were then made to
the genes. We have recently observed somewhat higher
secretion of immunoglobulin light chains compared to
heavy chains in other expression constructs in the laboratory
(Reff et al, 1997, unpublished observations). In
an attempt to compensate for this deficit, we altered
the 5E8 heavy chain gene by the addition of a stronger
promoter/enhancer element immediately upstream of the
start site. In subsequent steps, a 2.9kb DNA fragment
comprising the 5E8 modified light and heavy chain genes
was isolated from the N5KG1 vector and inserted into the
targeting vector Mandy. Preparation of 5E8-containing
Molly and electroporation into Desmond 15C9 CHO cells
was essentially as described in the preceding section.

One modification to the previously described protocol
was in the type of culture medium used. Desmond
marked CHO cells were cultured in protein-free CD-CHO
medium (Gibco-BRL, catalog # AS21206) supplemented with
3mg/L recombinant insulin (3mg/mL stock, Gibco-BRL,
catalog # AS22057) and 8mM L-glutamine (200mM stock,
Gibco-BRL, catalog # 25030-081). Subsequently, transfected
cells were selected in the above medium supplemented
with 400 µg/mL geneticin. In this experiment, 20
electroporations were performed and plated into 96 well
tissue culture dishes. Cells grew and secreted anti-CD23
in a total of 68 wells, all of which were assumed
to be clones originating from a single G418 resistant cell.
Twelve of these wells were expanded to 120ml spinner
flasks for further analysis. We believe the increased
number of clones isolated in this experiment (68 compared
with 10 for anti-CD20 as described in Example 4)
is due to a higher cloning efficiency and survival rate
of cells grown in CD-CHO medium compared with CHO-S-SFMII
medium. Expression levels for those clones analyzed
in spinner culture ranged from 0.5-3pg/c/d, in
close agreement with the levels seen for the anti-CD20
clones. The highest producing anti-CD23 clone, designated
4H12, was subjected to methotrexate amplification
in order to increase its expression levels. This amplification
was set up in a manner similar to that described
for the anti-CD20 clone in Example 4. Serial dilutions
of exponentially growing 4H12 cells were plated
into 96 well tissue culture dishes and grown in CD-CHO
medium supplemented with 3mg/L insulin, 8mM glutamine
and 30, 35 or 40nM methotrexate. A summary of this
amplification experiment is presented in Table 5.
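
The supplement concentrations quoted for the CD-CHO medium in this example translate into stock volumes by the usual dilution relation C1 · V1 = C2 · V2. The sketch below works this out for one litre of medium using only the stock and final concentrations given in the text; the function name and the one litre batch size are assumptions for illustration.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed so that final_volume_ml of medium reaches
    final_conc. Both concentrations must be in the same units."""
    return final_conc * final_volume_ml / stock_conc


batch_ml = 1000.0  # assumed batch size: 1 litre of medium

# Insulin: 3 mg/L final from a 3 mg/mL stock (= 3000 mg/L).
insulin_ml = stock_volume_ml(3.0, 3000.0, batch_ml)

# L-glutamine: 8 mM final from a 200 mM stock.
glutamine_ml = stock_volume_ml(8.0, 200.0, batch_ml)

print(f"insulin stock per litre: {insulin_ml:.1f} mL")
print(f"glutamine stock per litre: {glutamine_ml:.1f} mL")
```

The calculation treats the final volume as the total after addition, so for the glutamine stock the base medium volume would be correspondingly smaller.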



Table 5:

Summary of 4H12 Amplification

         # Wells   Expression Level  # Wells   Expression Level
nM MTX   Assayed   mg/l, 96 well     Expanded  pg/c/d from spinner

30       100       6-24              8         10-25
35       64        4-27              2         10-15
40       96        4-20              1         ND

The highest expressing clone obtained was a 30nM clone,
isolated from a plate on which 22 wells had grown.
This clone, designated 4H12-30G5, was reproducibly
secreting 18-22pg antibody per cell per day. This is
the same range of expression seen for the first amplification
of the anti-CD20 clone 20F4 (clone 20F4-15A5,
which produced 15-18pg/c/d, as described in Example 4).
These data further support the observation
that amplification at this marked site in CHO is reproducible
and efficient. A second amplification of this
30nM cell line is currently underway. It is anticipated
that saturation levels of expression will be
achievable for the anti-CD23 antibody in just two amplification
steps, as was the case for anti-CD20.

EXAMPLE 6

Expression of Immunoadhesin in Desmond Marked CHO Cells

CTLA-4, a member of the Ig superfamily, is found on
the surface of T lymphocytes and is thought to play a
role in antigen-specific T-cell activation (Dariavach et
al, Eur. J. Immunol., 18:1901-1905 (1988); and Linsley
et al, J. Exp. Med., 174:561-569 (1991)). In order to
further study the precise role of the CTLA-4 molecule in
the activation pathway, a soluble fusion protein comprising
the extracellular domain of CTLA-4 linked to a
truncated form of the human IgG1 constant region was
created (Linsley et al, Id.). We have recently
expressed this CTLA-4 Ig fusion protein in the mammalian
expression vector BLECH1, a derivative of the plasmid
NEOSPLA (Barnett et al, in Antibody Expression and
Engineering, H. Y. Yang and T. Imanaka, eds., pp27-40 (1995)).
An 800bp fragment encoding the CTLA-4 Ig was isolated
from this vector and inserted between the SacII and
BglII sites in Molly.

Preparation of CTLA-4 Ig-Molly and electroporation 
into Desmond clone 15C9 CHO cells was performed as de- 
scribed in the previous example relating to anti-CD20. 
Twenty electroporations were carried out, and plated 
into 96 well culture dishes as described previously. 
Eighteen CTLA-4 expressing wells were isolated from the 
96 well plates and carried forward to the 120ml spinner 
stage. Southern analyses on genomic DNA isolated from 
each of these clones were then carried out to determine 
how many of the homologous clones contained additional 
random integrants. Genomic DNA was digested with BglII
and probed with a PCR-generated, digoxigenin-labelled
probe to the human IgG1 constant region. The results of
this analysis indicated that 85% of the CTLA-4 clones
were homologous integrants only; the remaining 15%
contained one additional random integrant. This result
corroborates the findings from the expression of
anti-CD20 discussed above, where 80% of the clones were
single homologous integrants. Therefore, we can conclude
that this expression system reproducibly yields single
targeted homologous integrants in at least 80% of all
clones produced.
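
The tally behind percentages of this kind can be sketched as follows. This is a minimal illustration only: it assumes that a single BglII band hybridizing to the IgG1 probe indicates a homologous integrant alone and that each further band indicates one additional random integrant. The function names and the band counts are hypothetical and are not the data reported above.

```python
from collections import Counter


def classify_clone(hybridizing_bands: int) -> str:
    """Classify a clone from the number of bands detected by the probe.

    Assumption (not stated in the text): one band means a homologous
    integrant only; every extra band means one additional random integrant.
    """
    if hybridizing_bands < 1:
        return "no integrant detected"
    if hybridizing_bands == 1:
        return "homologous only"
    return f"homologous + {hybridizing_bands - 1} random"


def summarize(band_counts):
    """Return the percentage of clones falling in each class."""
    tally = Counter(classify_clone(n) for n in band_counts)
    total = len(band_counts)
    return {label: round(100 * count / total, 1) for label, count in tally.items()}


# Hypothetical band counts for 20 clones (illustrative only).
example_counts = [1] * 17 + [2] * 3
print(summarize(example_counts))
```

With the hypothetical counts shown, 17 of 20 clones fall in the "homologous only" class and 3 of 20 carry one extra band.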

Expression levels for the homologous CTLA-4 Ig
clones ranged from 8-12 pg/cell/day. This is somewhat
higher than the range reported for anti-CD20 antibody 
and anti-CD23 antibody clones discussed above. However, 
we have previously observed that expression of this 
molecule using the intronic insertion vector system also 
resulted in significantly higher expression levels than 
are obtained for immunoglobulins. We are currently 
unable to provide an explanation for this observation. 

EXAMPLE 7 

Targeting Anti-CD20 to an Alternate Desmond Marked CHO
Cell Line

As we described in a preceding section, we obtained
5 single copy Desmond marked CHO cell lines (see Figures
4 and 5). In order to demonstrate that the success of
our targeting strategy is not due to some unique property
of Desmond clone 15C9 and limited only to this clone,
we introduced anti-CD20 Molly into Desmond clone 9B2
(lane 6 in Figure 4, lane 1 in Figure 5). Preparation
of Molly DNA and electroporation into Desmond 9B2 were
exactly as described in the previous example pertaining
to anti-CD20. We obtained one homologous integrant from
this experiment. This clone was expanded to a 120ml
spinner flask, where it produced on average 1.2 pg
anti-CD20/cell/day. This is considerably lower expression
than we observed with Molly targeted into Desmond 15C9.
However, this was the anticipated result, based on our
northern analysis of the Desmond clones. As can be seen
in Figure 5, mRNA levels from clone 9B2 are considerably
lower than those from 15C9, indicating the site in this
clone is not as transcriptionally active as that in
15C9. Therefore, this experiment not only demonstrates
the reproducibility of the system (presumably any
marked Desmond site can be targeted with Molly), it also
confirms the northern data that the site in Desmond 15C9
is the most transcriptionally active.
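
The selection principle that distinguishes homologous integrants from random ones in these examples can be modelled in a few lines. This is a simplified sketch, not the actual screening procedure: it assumes the marked site carries neomycin phosphotransferase exon 3 (N3 in the Desmond legend) and the targeting plasmid carries exons 1 and 2 (N1 and N2 in the Molly legend), and that a functional marker arises only when all three exons are brought together at the marked site. The class and function names are hypothetical.

```python
from dataclasses import dataclass

# The three neomycin phosphotransferase exons named in the figure legends.
NEO_EXONS = frozenset({"N1", "N2", "N3"})


@dataclass(frozen=True)
class Integration:
    plasmid_exons: frozenset  # neo exons carried by the integrating plasmid
    at_marked_site: bool      # True if integration occurred by homologous
                              # recombination at the marked site


def expresses_functional_neo(marked_site_exons, integration) -> bool:
    """Return True only if a complete neo gene is assembled.

    Simplifying assumption: exons contributed by a randomly integrated
    plasmid are never joined to the exon residing at the marked site.
    """
    if not integration.at_marked_site:
        return False
    assembled = set(marked_site_exons) | set(integration.plasmid_exons)
    return NEO_EXONS <= assembled


desmond_site = {"N3"}                  # exon present at the marked site
molly_exons = frozenset({"N1", "N2"})  # exons on the targeting plasmid

homologous = Integration(molly_exons, at_marked_site=True)
random_hit = Integration(molly_exons, at_marked_site=False)

print(expresses_functional_neo(desmond_site, homologous))  # True
print(expresses_functional_neo(desmond_site, random_hit))  # False
```

In this model a cell with only a random integrant never scores as marker-positive, which is why screening for the reconstituted marker enriches for targeted clones; clones carrying a homologous integrant plus an extra random copy would still score positive and have to be distinguished by Southern analysis.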

From the foregoing, it will be appreciated that,
although specific embodiments of the invention have been
described herein for purposes of illustration, various
modifications may be made without deviating from the
scope of the invention. Accordingly, the invention is
not limited except as by the appended claims.

WHAT IS CLAIMED IS:

1. A method for inserting a desired DNA at a 
target site in the genome of a mammalian cell which 
comprises the following steps: 

(i) transfecting or transforming a mammalian cell 
with a first plasmid ("marker plasmid") containing the 
following sequences:

(a) a region of DNA that is heterologous to 
the mammalian cell genome which when integrated in the 
mammalian cell genome provides a unique site for homolo- 
gous recombination; 

(b) a DNA fragment encoding a portion of a 
first selectable marker protein; and 

(c) at least one other selectable marker DNA 
that provides for selection of mammalian cells which 
have been successfully integrated with the marker plas- 
mid; 

(ii) selecting a cell which contains the marker
plasmid integrated in its genome;

(iii) transfecting or transforming said selected 
cell with a second plasmid ("target plasmid") which 
contains the following sequences: 

(a) a region of DNA that is identical or is
sufficiently homologous to the unique region in the
marker plasmid such that this region of DNA can
recombine with said DNA via homologous recombination;

(b) a DNA fragment encoding a portion of the
same selectable marker contained in the marker plasmid, 
wherein the active selectable marker protein encoded by 
said DNA is only produced if said fragment is expressed 
in association with the fragment of said selectable 
marker DNA contained in the marker plasmid; and 

(iv) selecting cells which contain the target plas- 
mid integrated at the target site by screening for the 
expression of the first selectable marker protein. 

2. The method of Claim 1, wherein the DNA frag- 
ment encoding a fragment of a first selectable marker is 
an exon of a dominant selectable marker. 

3. The method of Claim 2, wherein the second 
plasmid contains the remaining exons of said first 
selectable marker. 

4. The method of Claim 3, wherein at least one
DNA encoding a desired protein is inserted between said 
exons of said first selectable marker contained in the 
target plasmid. 

5. The method of Claim 4, wherein a DNA encoding a
dominant selectable marker is further inserted between
the exons of said first selectable marker contained in
the target plasmid to provide for co-amplification of
the DNA encoding the desired protein.

WO 98/41645    PCT/US98/03935

6. The method of Claim 3, wherein the first domi- 
nant selectable marker is selected from the group con- 
sisting of neomycin phosphotransferase, histidinol dehy- 
drogenase, dihydrofolate reductase, hygromycin phospho- 
transferase, herpes simplex virus thymidine kinase, 
adenosine deaminase, glutamine synthetase, and 
hypoxanthine-guanine phosphoribosyl transferase.

7. The method of Claim 4, wherein the desired 
protein is a mammalian protein. 

8. The method of Claim 7, wherein the protein is 
an immunoglobulin. 

9. The method of Claim 1, which further comprises 
determining the RNA levels of the selectable marker (c) 
contained in the marker plasmid prior to integration of 
the target vector. 

10. The method of Claim 9, wherein the other
selectable marker contained in the marker plasmid is a
dominant selectable marker selected from the group
consisting of histidinol dehydrogenase, herpes simplex
thymidine kinase, hygromycin phosphotransferase,
adenosine deaminase and glutamine synthetase.

11. The method of Claim 1, wherein the mammalian 
cell is selected from the group consisting of Chinese 
hamster ovary (CHO) cells, myeloma cells, baby hamster
kidney cells, COS cells, NSO cells, HeLa cells and NIH 
3T3 cells. 

12. The method of Claim 11, wherein the cell is a 
CHO cell. 

13. The method of Claim 1, wherein the marker 
plasmid contains the third exon of the neomycin phospho- 
transferase gene and the target plasmid contains the 
first two exons of the neomycin phosphotransferase gene. 

14. The method of Claim 1, wherein the marker 
plasmid further contains a rare restriction endonuclease 
sequence which is inserted within the region of homolo- 
gy. 

15. The method of Claim 1, wherein the unique 
region of DNA that provides for homologous recombination 
is a bacterial DNA, a viral DNA or a synthetic DNA.

16. The method of Claim 1, wherein the unique
region of DNA that provides for homologous recombination 
is at least 300 nucleotides. 

17. The method of Claim 16, wherein the unique 
region of DNA ranges in size from about 300 nucleotides 
to 20 kilobases. 

18. The method of Claim 17, wherein the unique
region of DNA preferably ranges in size from 2 to 10 
kilobases. 

19. The method of Claim 1, wherein the first
selectable marker DNA is split into at least three
exons.

20. The method of Claim 1, wherein the unique 
region of DNA that provides for homologous recombination 
is a bacterial DNA, an insect DNA, a viral DNA or a 
synthetic DNA. 

21. The method of Claim 20, wherein the unique 
region of DNA does not contain any functional genes. 

22. A vector system for inserting a desired DNA at
a target site in the genome of a mammalian cell which
comprises at least the following:

(i) a first plasmid ("marker plasmid") containing
at least the following sequences:

(a) a region of DNA that is heterologous to
the mammalian cell genome which when integrated in the
mammalian cell genome provides a unique site for
homologous recombination;

(b) a DNA fragment encoding a portion of a
first selectable marker protein; and

(c) at least one other selectable marker DNA
that provides for selection of mammalian cells which
have been successfully integrated with the marker
plasmid; and

(ii) a second plasmid ("target plasmid") which
contains at least the following sequences:

(a) a region of DNA that is identical or is
sufficiently homologous to the unique region in the
marker plasmid such that this region of DNA can
recombine with said DNA via homologous recombination;

(b) a DNA fragment encoding a portion of the
same selectable marker contained in the marker plasmid,
wherein the active selectable marker protein encoded by
said DNA is only produced if said fragment is expressed
in association with the fragment of said selectable
marker DNA contained in the marker plasmid.

23. The vector system of Claim 22, wherein the DNA
fragment encoding a fragment of a first selectable
marker is an exon of a dominant selectable marker.

24. The vector system of Claim 23, wherein the
second plasmid contains the remaining exons of said 
first selectable marker. 

25. The vector system of Claim 24, wherein at 
least one DNA encoding a desired protein is inserted 
between said exons of said first selectable marker con- 
tained in the target plasmid. 

26. The vector system of Claim 24, wherein a DNA 
encoding a dominant selectable marker is further insert- 
ed between the exons of said first selectable marker 
contained in the target plasmid to provide for co-ampli- 
fication of the DNA encoding the desired protein. 

27. The vector system of Claim 24, wherein the
first dominant selectable marker is selected from the
group consisting of neomycin phosphotransferase,
histidinol dehydrogenase, dihydrofolate reductase,
hygromycin phosphotransferase, herpes simplex virus
thymidine kinase, adenosine deaminase, glutamine
synthetase, and hypoxanthine-guanine phosphoribosyl
transferase.

28. The vector system of Claim 25, wherein the
desired protein is a mammalian protein.

29. The vector system of Claim 28, wherein the 
protein is an immunoglobulin. 

30. The vector system of Claim 22, wherein the
other selectable marker contained in the marker plasmid
is a dominant selectable marker selected from the group
consisting of histidinol dehydrogenase, herpes simplex
thymidine kinase, hygromycin phosphotransferase,
adenosine deaminase and glutamine synthetase.

31. The vector system of Claim 22, which provides
for insertion of a desired DNA at a targeted site in the
genome of a mammalian cell selected from the group
consisting of Chinese hamster ovary (CHO) cells, myeloma
cells, baby hamster kidney cells, COS cells, NSO cells,
HeLa cells and NIH 3T3 cells.

32. The vector system of Claim 31, wherein the
mammalian cell is a CHO cell.

33. The vector system of Claim 22, wherein the
marker plasmid contains the third exon of the neomycin
phosphotransferase gene and the target plasmid contains
the first two exons of the neomycin phosphotransferase
gene.

34. The vector system of Claim 22, wherein the 
marker plasmid further contains a rare restriction endo- 
nuclease sequence which is inserted within the region of 
homology. 

35. The vector system of Claim 22, wherein the 
unique region of DNA that provides for homologous recom- 
bination is a bacterial DNA, a viral DNA or a synthetic 
DNA. 

36. The vector system of Claim 22, wherein the 
unique region of DNA (a) contained in the marker plasmid 
vector system that provides for homologous recombination 
is at least 300 nucleotides. 

37. The vector system of Claim 36, wherein the
unique region of DNA ranges in size from about 300
nucleotides to 20 kilobases.

38. The vector system of Claim 37, wherein the
unique region of DNA preferably ranges in size from 2 to
10 kilobases.

39. The vector system of Claim 22, wherein the
first selectable marker DNA is split into at least three
exons.

40. The vector system of Claim 22, wherein the 
unique region of DNA that provides for homologous recom- 
bination is a bacterial DNA, an insect DNA, a viral DNA 
or a synthetic DNA. 

41. The vector system of Claim 40, wherein the
unique region of DNA does not contain any functional
genes.

HD = Salmonella HisD Gene
N3 = Neomycin Phosphotransferase Exon 3
D = Murine Dihydrofolate reductase
E = Cytomegalovirus and SV40 Enhancers
SA = Splice acceptor
BT = Mouse Beta Globin Major Promoter
B = Bovine Growth Hormone Polyadenylation
S = SV40 Early Polyadenylation
SV = SV40 Late Polyadenylation


FIGURE 1A

Molly

D = Dihydrofolate reductase
N1 = Neomycin Phosphotransferase Exon 1
N2 = Neomycin Phosphotransferase Exon 2
VL = Anti-CD20 Light chain Leader + Variable
K = Human Kappa Constant
VH = Anti-CD20 Heavy chain Leader + Variable
G1 = Human Gamma 1 Constant
HD = Salmonella Histidinol Dehydrogenase
E = CMV and SV40 enhancers
S = SV40 Origin
SD = Splice donor
SA = Splice acceptor
C = CMV promoter/enhancer
T = HSV TK promoter and Polyoma enhancers
BT = Mouse Beta Globin Major Promoter
SV = SV40 Late Polyadenylation
B = Bovine Growth Hormone Polyadenylation

FIGURE 2B

Southern Analysis of Anti-CD20
Integrants in Marked CHO Cells

FIGURE 6

Desmond Lark

FIGURE 7

10 20 30 40 50 60

TTTCTAGACC TAGGGCGGCC AGCTAGTAGC TTTGCTTCTC AATTTCTTAT TTGCATAATG

70 80 90 100 110 120

ACAAAAAAAG GAAAATTAAT TTTAACACCA ATTCAGTAGT TGATTGAGCA AATGCCTTGC 

130 140 150 160 170 180

CAAAAAGGAT GCTTTAGAGA CAGTGTTCTC TGCACAGATA AGGACAAACA TTATTCAGAG 

190 200 210 220 230 240 

GGAGTACCCA GAGCTGAGAC TCCTAAGCCA GTGAGTGGCA CAGCATCCAG GGAGAAATAT 

250 260 270 280 290 300 

GCTTGTCATC ACCGAAGCCT GATTCCGTAG AGCCACACCC TGGTAAGGGC CAATCTGCTC 

310 320 330 340 350 360 

ACACAGGATA GAGAGGGCAG GAGCCAGGGC AGAGCATATA AGGTGAGGTA GGATCAGTTG 

370 380 390 400 410 420 

CTCACAT TTGCTTCTGA CATAGTTGTG TTGGGAGCTT GGATAGCTTG GGGGGGGGAC 

430 440 450 460 470 480 

AGCTCAGGGC TGCGATTTCG CGCCAAACTT GACGGCAATC CTAGCGTGAA GGCTGGTAGG 

490 500 510 520 530 540 

ATTTTATCCC CGCTGCCATC ATGGTTCGAC CATTGAACTG CATCGTCGCC GTGTCCCAAA 

550 560 570 580 590 600 

ATATGGGGAT TGGCAACAAC GGAGACCTAC CCTGGCCTCC GCTCAGGAAC GAGTTCAAGT 

610 620 630 640 650 660 

ACTTCCAAAG AATGACCACA ACCTCTTCAG TGGAAGGTAA ACAGAATCTG GTGATTATGG 

670 680 690 700 710 720

GTAGGAAAAC CTGGTTCTCC ATTCCTGACA AGAATCGACC TTTAAAGGAC AGAATTAATA 

730 740 750 760 770 780 

;ttctcag tagagaactc aaagaaccac cacgaggagc tcattttctt gccaaaagtt 

790 800 810 820 830 840 

TGGATGATGC CTTAAGACTT ATTGAACAAC CGGAATTGGC AAGTAAAGTA GACATGGTTT 

850 860 870 880 890 900 

GGATAGTCGG AGGCAGTTCT GTTTACCAGG AAGCCATGAA TCAACCAGGC CACCTCAGAC 

910 920 930 940 950 960 

TCTTTGTGAC AAGGATCATG CAGGAATTTG AAAGTGACAC GTTTTTCCCA GAAATTGATT 

970 980 990 1000 1010 1020

TGGGGAAATA TAAACTTCTC CCAGAATACC CAGGCGTCCT CTCTGAGGTC CAGGAGCAAA

1030 1040 1050 1060 1070 1080

AAGGCATCAA GTATAAGTTT GAAGTCTACG AGAAGAAAGA CTAACAGGAA GATGCTTTCA 

1090 1100 1110 1120 1130 1140

ACTTCTCTGC TCCCCTCCTA AAGCTATGCA TTTTTATAAG ACCATGGGAC TTTTGCTGGC 

1150 1160 1170 1180 1190 1200 

TTTAGATCAG CCTCGACTGT GCCTTCTAGT TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC 

1210 1220 1230 1240 1250 1260 

GTGCCTTCCT TGACCCTCGA AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA 

1270 1280 1290 1300 1310 1320 

ATTGCATCGC ATTGTCTGAG TAGGTGTCAT TCTATTCTGG GCGGTGGGGT GGGGCAGGAC

1330 1340 1350 1360 1370 1380

AGCAAGGGGG AGGATTGGGA AGACAATAGC AGGCATGCTG GGGATGCGGT GGGCTCTATG

1390 1400 1410 1420 1430 1440 

GCTTCTGAGG CGGAAAGAAC CAGCTGGGGC TCGAAGCGGC CGCCCATTTC GCTGGTCGTC 

1450 1460 1470 1480 1490 1500 

AGATGCGGGA TGGCGTGCGA CGCGGCGGGG AGCGTCACAC TGAGGTTTTC CGCCAGACGC

1510 1520 1530 1540 1550 1560 

CACTGCTGCC AGGCGCTGAT GTGCCCGGCT TCTGACCATG CGGTCGCGTT CGGTTGCACT 

1570 1580 1590 1600 1610 1620 

ACGCGTACTG TGAGCCAGAG TTGCCCGGCG CTCTCCGGCT GCGGTAGTTC AGGCAGTTCA 

1630 1640 1650 1660 1670 1680 

ATCAACTGTT TACCTTGTGG AGCGACATCC AGAGGCACTT CACCGCTTGC CAGCGGCTTA 

1690 1700 1710 1720 1730 1740 

ATCCAGCG CCACCATCCA GTGCAGGAGC TCGTTATCGC TATGACGGAA CAGGTATTCG 

1750 1760 1770 1780 1790 1800 

CTGGTCACTT CGATGGTTTG CCCGGATAAA CGGAACTGGA AAAACTGCTG CTGGTGTTTT 

1810 1820 1830 1840 1850 1860 

GCTTCCGTCA GCGCTGGATG CGGCGTGCGG TCGGCAAAGA CCAGACCGTT CATACAGAAC 

1870 1880 1890 1900 1910 1920 

TGGCGATCGT TCGGCGTATC GCCAAAATCA CCGCCGTAAG CCGACCACGG CTTGCCGTTT 

1930 1940 1950 1960 1970 1980 

TCATCATATT TAATCAGCGA CTGATCCACC CAGTCCCAGA CGAAGCCGCC CTGTAAACGG 

1990 2000 2010 2020 2030 2040 

GGATACTGAC GAAACGCCTG CCAGTATTTA GCGAAACCGC CAAGACTGTT ACCCATCGCG 

2050 2060 2070 2080 2090 2100 

,GGCGTATT CGCAAAGGAT CAGCGGGCGC GTCTCTCCAG GTAGCGAAAG CCATTTTTTG 

2110 2120 2130 2140 2150 2160 

ATGGACCATT TCGGCACAGC CGGCAAGGGC TGGTCTTCAT CCACGCGCGC GTACATCGGG 

2170 2180 2190 2200 2210 2220 

CAAATAATAT CGGTGGCCCT GGTGTCGGCT CCGCCGCCTT CATACTGCAC CGGCCGGGAA 

2230 2240 2250 2260 2270 2280 

GGATCGACAG ATTTGATCCA GCGATACAGC GCGTCGTGAT TAGCGCCGTG GCCTGATTCA 

2290 2300 2310 2320 2330 2340 

TTCCCCAGCG ACCAGATGAT CACACTCGGG TCATTACGAT CGCGCTGCAC CATTCGCGTT 

2350 2360 2370 2380 2390 2400 

ACGCGTTCGC TCATCGCCGG TAGCCAGCGC GGATCATCGG TCAGACGATT CATTGGCACC

2410 2420 2430 2440 2450 2460 

ATGCCGTGGG TTTCAATATT GGCTTCATCC ACCACATACA GGCCGTAGCG GTCGCACAGC 

2470 2480 2490 2500 2510 2520 

GTGTACCACA GCGGATGGTT CGGATAATGC GAACAGCGCA CGGCGTTAAA GTTGTTCTGC 

2530 2540 2550 2560 2570 2580 

TTCATCAGCA GGATATCCTG CACCATCGTC TGCTCATCCA TGACCTGACC ATGCAGAGGA 



2590 2600 2610 2620 2630 2640

TGATGCTCGT GACGCTTAAC GCCTCGAATC AGCAACGGCT TGCCGTTCAG CAGCAGCAGA

2650 2660 2670 2680 2690 2700 

CCATTTTCAA TCCGCACCTC GCGGAAACCG ACATCGCAGG CTTCTGCTTC AATCAGCGTG 

2710 2720 2730 2740 2750 2760 

CCGTCGGCGG TGTGCAGTTC AACCACCGCA CGATAGAGAT TCGGGATTTC GGCGCTCCAC 

2770 2780 2790 2800 2810 2820 

AGTTTCGGGT TTTCGACGTT CAGACGTAGT GTGACGCGAT CGGCATAACC ACCACGCTCA 

2830 2840 2850 2860 2870 2880

TCGATAATTT CACCGCCGAA AGGCGCGGTG CCGCTCGCGA CCTGCGTTTC ACCCTCCCAT 

2890 2900 2910 2920 2930 2940 

AAAGAAACTG TTACCCGTAG GTAGTGACGC AACTCGCCGC ACATCTGAAC TTCAGCCTCC

2950 2960 2970 2980 2990 3000 

AGTACAGCGC GGCTGAAATC ATCATTAAAG CGAGTCGCAA CATGGAAATC GCTGATTTGT

3010 3020 3030 3040 3050 3060 

U lAGTCGGTT TATGCAGCAA CGAGACGTCA CGGAAAATGC CGCTCATCCG CCACATATCC 

3070 3080 3090 3100 3110 3120

TGATCTTCCA GATAACTGCC GTCACTCCAG CGCAGCACCA TCACCGCGAG GCGGTTTTCT 

3130 3140 3150 3160 3170 3180 

CCGGCGCGTA AAAATGCGCT CAGGTCAAAT TCAGACGGCA AACGACTGTC CTGGCCGTAA 

3190 3200 3210 3220 3230 3240 

CCGACCCAGC GCCCGTTGCA CCACAGATGA AACGCCGAGT TAACGCCATC AAAAATAATT 

3250 3260 3270 3280 3290 3300 

CGCGTCTGGC CTTCCTGTAG CCAGCTTTCA TCAACATTAA ATGTGAGCGA GTAACAACCC 

3310 3320 3330 3340 3350 3360 

GTCGGATTCT CCGTGGGAAC AAACGGCGGA TTGACCGTAA TGGGATAGGT CACGTTGGTG 

3370 3380 3390 3400 3410 3420 

IAGATGGGCG CATCGTAACC GTGCATCTGC CAGTTTGAGG GGACGACGAC AGTATCGGCC 

3430 3440 3450 3460 3470 3480 

TCAGGAAGAT CGCACTCCAG CCAGCTTTCC GGCACCGCTT CTGGTGCCGG AAACCAGGCA 

3490 3500 3510 3520 3530 3540 

AAGCGCCATT CGCCATTCAG GCTGCGCAAC TGTTGGGAAG GGCGATCGCT GCGGGCCTCT 

3550 3560 3570 3580 3590 3600 

TCGCTATTAC GCCAGCTGGC GAAAGGGGGA TGTGCTGCAA GGCGATTAAG TTGGGTAACG 

3610 3620 3630 3640 3650 3660 

CCAGGGTTTT CCCAGTCACG ACGTTGTAAA ACGACTTAAT CCGTCGAGGG GCTGCCTCGA 

3670 3680 3690 3700 3710 3720 

AGCAAACGAC CTTCCGTTGT GCAGCCAGCG GCGCCTGCGC CGGTGCCCAC AATCGTGCCC 

3730 3740 3750 3760 3770 3780 

GAACAAACTA AACCAGAACA AATTATACCG GCGGCACCGC CGCCACCACC TTCTCCCGTG 

3790 3800 3810 3820 3830 3840 

CCTAACATTC CAGCGCCTCC ACCACCACCA CCACCATCGA TGTCTGAATT GCCGCCCGCT

3850 3860 3870 3880 3890 3900 

CCACCAATGC CGACGGAACC TCAACCCGCT GCACCTTTAG ACCACAGACA ACAATTGTTG

3910 3920 3930 3940 3950 3960

OMCtSS! GAAACgS AAATCgKct CGTCTCAGAC CGCCTCTCTT AAGGTAGCTC

3970 3980 3990 4000 4010 4020

AAACCAAAAA CGGCGCCCGA AACCAGTACA ATAGTTGAGG TGCCGACTGT GTTGCCTAAA

4030 4040 4050 4060 4070 4080

GACACATTTG AGCCTaSS GCCGTcSS TCACCGCCAC CACCTCCGCC TCCGCCTCCG

4090 4100 4110 4120 4130 4140

CCCCCMCK CGCCTGW?? TCCACcSK GTAGATTTAT aTtwSS ACCKcSS

4150 4160 4170 4180 4190 4200

ccattaSS atttgccgtc tgaaatS! ccaccgSK caccatSct ttctmcSS

4210 4220 4230 4240 4250 4260

ttgtctSS? taaaat«!°g cacagttaga ttgaaacccg cccaaaaacg cccgcaatca

4270 4280 4290 4300 4310 4320

*ataS? caaaaaJS? aactacaaat ttgatcgcgg acgtgttagc cgacacaatt

4330 4340 4350 4360 4370 4380

aataggJct? gtgtggS? ggcaaaI™ tcttccSS caaotSS cgacgaJggt

4390 4400 4410 4420 4430 4440

TGGGACGACG ACGATAATCG GCCTAATAAA GCTAACACGC CCGATGTTAA ATATGTCCAA

4450 4460 4470 4480 4490 4500

gctactaI™ gtacct^St TAAGGGGCGG AGAATGGGCG GAACTGGGCG GAGTTAGGGG

4510 4520 4530 4540 4550 4560

CGGGATGGGC GGAGTTAGW GCGGGACTAT GGTTGCTGAC TAATTGAGAT GCATGCTTTG

4570 4580 4590 4600 4610 4620

CATACTTCTG CCTCClSS ACCCTcSS CTTTCCACAC CTGGTTGCTG ACTAATTGAG

4630 4640 4650 4660 4670 4680

'GCAlim TGCATACTTC TGCCTGCTGG GGAGCCTGGG GACTTTCCAC ACCCTAACTG

4690 4700 4710 4720 4730 4740

AuacJ5?°c cacagaJ™ attcccct" ttattaI™ taatcaatta cggggtcatt

4750 4760 4770 4780 4790 4800

agttcaKS ccatatI™ agttccS? tacataa"™ acggta1™ gcccgcJXg

4810 4820 4830 4840 4850 4860

ctgaccJS? aacgacSc? gcccatJSc gtcaataatg acgtatgttc ccatagtaac

4870 4880 4890 4900 4910 4920

gccaataIS actttccatt gacgtcIItg cgtggagtat ttacggtaaa ctgcccactt

4930 4940 4950 4960 4970 4980

ggcagtJS? caa^? atatgcSSg tacgccccct attgacgtca atgacggtaa

4990 5000 5010 5020 5030 5040

atggccS?? tggcattatg cccagt^S? gacctt5?« gactttSX cttggJct!

5050 5060 5070 5080 5090 5100

CATCTACCTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT ACATCAATGG

5110 5120 5130 5140 5150 5160

GCGTGGATAG CGGTTrScT CACGGGGATT TCCAAGTCTC CACCCCATTG ACGTCAATGG

5170 5180 5190 5200 5210 5220

GAGTTTGTTT TGAAGCTO CCGGCCAGCT TTATTTAACG TGTTTACGTC GAGTCAATTG

5230 5240 5250 5260 5270 5280

TACACTAACG ACAGTGATCA AAGAAATACA AAAGCGCATA ATATTTTGAA CGACGTfGAA 

5290 5300 5310 5320 5330 5340 

CCTTTATTAC AAAACAAAAC ACAAACGAAT ATCGACAAAG CTAGATTGCT GCTACAAGAT 

5350 5360 5370 5380 5390 5400

TTGGCAAGTT TTGTGGCGTT GAGCGAAAAT CCATTAGATA GTCCAGCCAT CGGTTCGGAA 

5410 5420 5430 5440 5450 5460 

AAACAACCCT TGTTTGAAAC TAATCGAAAC CTATTTTACA AATCTATTGA GGATTTAATA 

5470 5480 5490 5500 5510 5520 

TTTAAATTCA GATATAAAGA CGCTGAAAAT CATTTGATTT TCGCTCTAAC ATACCACCCT 

5530 5540 5550 5560 5570 5580 

AAAGATTATA AATTTAATGA ATTATTAAAA TACATCAGCA ACTATATATT GATAGACATT 

5590 5600 5610 5620 5630 5640 

. ^AGTTTGT GATATTAGTT TGTGCGTCTC ATTACAATGG CTGTTATTTT TAACAACAAA 

5650 5660 5670 5680 5690 5700

CAACTGCTCG CAGACAATAG TATAGAAAAG GGAGGTGAAC TGTTT7TGTT TAACGGTTCG 

5710 5720 5730 5740 5750 5760 

TACAACATTT TGGAAAGTTA TGTTAATCCG GTGCTGCTAA AAAATCGTGT AATTGAACTA 

5770 5780 5790 5800 5810 5820 

GAAGAAGCTG CGTACTATGC CGGCAACATA TTGTACAAAA CCGACGATCC CAAATTCATT

5830 5840 5850 5860 5870 5880 

GATTATATAA ATTTAATAAT TAAAGCAACA CACTCCGAAG AACTACCAGA AAATAGCACT 

5890 5900 5910 5920 5930 5940 

GTTGTAAATT ACAGAAAAAC TATGCGCAGC GGTACTATAC ACCCCATTAA AAAAGACATA 



5950 5960 5970 5980 5990 6000

. . . f ATTTATG ACAACAAAAA ATTTACTCTA TACGATAGAT ACATATATGG ATA CG ATA AT 

6010 6020 6030 6040 6050 6060 

AACTATGTTA ATTTTTATGA GGAGAAAAAT GAA AAA GAGA AGGAATACGA ACAAGAAGAC 

6070 6080 6090 6100 6110 6120 

GACAAGGCGT CTAGTTTATG TGAAAATAAA ATTATATTGT CGCAAATTAA CTGTGAATCA 

6130 6140 6150 6160 6170 6180 

TTTGAAAATG ATTTTAAATA TTACCTCAGC GATTATAACT ACGCGTTTTC AATTATAGAT 

6190 6200 6210 6220 6230 6240 

AATACTACAA ATCTTCTTGT TGCGTTTGGT TTGTATCGTT AATAAAAAAC AAATTTGACA 

6250 6260 6270 6280 6290 6300 

TTTATAATTG TTTTATTATT CAATAATTAC AAATAGGATT GAGACCCTTG CAGTTGCCAG 

6310 6320 6330 6340 6350 6360 

CAAACGGACA GAGCTTGTCG AGGAGAGTTG TTGATTCATT GTTTGCCTCC CTGCTGCGGT 

6370 6380 6390 6400 6410 6420 

TTTTCACCGA AGTTCATGCC AGTCCAGCGT TTTTGCAGCA GAAAAGCCGC CGACTTCGGT 

6430 6440 6450 6460 6470 6480 

TTGCGGTCGC GAGTGAAGAT CCCTTTCTTC TTACCGCCAA CGCGCAATAT GCCTTGCGAG 



6490 6500 6510 6520 6530 6540

GTCGCAAAAT CGGCGAAATT CCATACCTGT TCACCGACGA CGGCGCTGAC GCCATCAAAG

6550 6560 6570 6580 6590 6600

ACGCGGTGAT ACATATCCAG CCATGCACAC TGATACTCTT CACTCCACAT GTCCGTGTAC 

6610 6620 6630 6640 6650 6660 

ATTGAGTGCA GCCCGGCTAA CGTATCCACG CCGTATTCGG TGATGATAAT CGGCTGATGC 

6670 6680 6690 6700 6710 6720 

AGTTTCTCCT GCCAGGCCAG AAGTTCTTTT TCCAGTACCT TCTCTGCCGT TTCCAAATCG 

6730 6740 6750 6760 6770 6780 

CCGCTTTGGA CATACCATCC GTAATAACGG TTCAGGCACA GCACATCAAA GAGATCCCTG 

6790 6800 6810 6820 6830 6840 

ATGGTATCGG TGTGAGCGTC GCAGAACATT ACATTGACGC AGCTGATCGG ACGCGTCGGG

6850 6860 6870 6880 6890 6900 

TCGAGTTTAC GCGTTGCTTC CGCCAGTGGC GCGAAATATT CCCGTGCACC TTGCGGACGG 

6910 6920 6930 6940 6950 6960

GTATCCGGTT CGTTGGCAAT ACTCCACATC ACCACGCTTG GGTCGTTTTT GTCACGCGCT 

6970 6980 6990 7000 7010 7020 

ATCAGCTCTT TAATCGCCTG TAAGTGCGCT TGCTGAGTTT CCCCGTTGAC TGCCTCTTCG 

7030 7040 7050 7060 7070 7080 

CTGTACAGTT CTTTCGGCTT GTTGCCCGCT TCGAAACCAA TGCCTAAAGA GAGGTTAAAG 

7090 7100 7110 7120 7130 7140 

CCGACAGCAG CAGTTTCATC AATCACCACG ATGCCATGTT CATCTGCCCA GTCGAGCATC 

7150 7160 7170 7180 7190 7200 

TCTTCAGCGT AAGGGTAATG CGAGGTACGG TAGGAGTTGG CCCCAATCCA GTCCATTAAT 

7210 7220 7230 7240 7250 7260

GCGTGGTCGT GCACCATCAG CACGTTATCG AATCCTTTGC CACGCAAGTC CGCATCTTCA 

7270 7280 7290 7300 7310 7320 

TGACGACCAA AGCCAGTAAA GTAGAACGGT TTGTGGTTAA TCAGGAACTG TTCGCCCTTC 

7330 7340 7350 7360 7370 7380

ACTGCCACTG ACCGGATGCC GACGCGAAGC GGGTAGATAT CACACTCTGT CTGGCTTTTG 

7390 7400 7410 7420 7430 7440 

GCTGTGACGC ACAGTTCATA GAGATAACCT TCACCCGGTT GCCAGAGGTG CGGATTCACC

7450 7460 7470 7480 7490 7500 

ACTTGCAAAG TCCCGCTAGT GCCTTGTCCA GTTGCAACCA CCTGTTGATC CGCATCACGC 

7510 7520 7530 7540 7550 7560 

AGTTCAACGC TGACATCACC ATTGGCCACC ACCTGCCAGT CAACAGACGC GTGGTTACAG 

7570 7580 7590 7600 7610 7620 

TCTTGCGCCA CATGCGTCAC CACGGTGATA TCGTCCACCC AGGTGTTCGG CGTGGTGTAG 

7630 7640 7650 7660 7670 7680 

ACCATTACGC TGCGATGGAT TCCGGCATAG TTAAAGAAAT CATGGAAGTA AGACTGCTTT 

7690 7700 7710 7720 7730 7740 

TTCTTGCCGT TTTCGTCGGT AATCACCATT CCCGGCGGGA TAGTCTGCCA GTTCAGTTCG 



7750 7760 7770 7780 7790 7800

TTGTTCACAC AAACGGTCAT ACCCCTCGAC GGATTAAAGA CTTCAAGCGG TCAACTATGA

7810 7820 7830 7840 7850 7860

AGAAGTCTTC GTCTTCGTCC CAGTAAGCTA TGTCTCCAGA ATGTAGCCAT CCATCCTTGT 

7870 7880 7890 7900 7910 7920 

CAATCAAGGC GTTGGTCGCT TCCGGATTGT TTACATAACC GGACATAATC ATAGGTCCTC 

7930 7940 7950 7960 7970 7980 

TGACACATAA TTCGCCTCTC TGATTAACGC CCAGCGTTTT CCCGGTATCC AGATCCACAA 

7990 8000 8010 8020 8030 8040 

CCTTCGCTTC AAAAAATGGA ACAACTTTAC CGACCGCGCC CGGTTTATCA TCCCCCTCGG

8050 8060 8070 8080 8090 8100 

GTGTAATCAG AATAGCTGAT GTAGTCTCAG TGAGCCCATA TCCTTGTCGT ATCCCTGGAA

8110 8120 8130 8140 8150 8160 

GATGGAAGCG TTTTGCAACC GCTTCCCCGA CTTCTTTCGA AAGAGGTGCG CCCCCAGAAG 

8170 8180 8190 8200 8210 8220 

\TTTCGTG TAAATTAGAT AAATCGTATT TGTCAATCAG AGTGCTTTTG GCGAAGAATG

8230 8240 8250 8260 8270 8280 

AAAATAGGGT TGGTACTAGC AACGCACTTT GAATTTTGTA ATCCTGAAGG GATCGTAAAA 

8290 8300 8310 8320 8330 8340 

ACAGCTCTTC TTCAAATCTA TACATTAAGA CGACTCGAAA TCCACATATC AAATATCCGA 

8350 8360 8370 8380 8390 8400 

GTGTAGTAAA CATTCCAAAA CCGTGATGGA ATGGAACAAC ACTTAAAATC GCAGTATCCG 

8410 8420 8430 8440 8450 8460 

GAATGATTTG ATTGCCAAAA ATAGGATCTC TGGCATGCGA GAATCTGACG CAGGCAGTTC 

8470 8480 8490 8500 8510 8520 

TATGCGGAAG GGCCACACCC TTAGGTAACC CAGTAGATCC AGAGGAATTG TTTTGTCACG 

8530 8540 8550 8560 8570 8580 

CAAAGGAC TCTGGTACAA AATCGTATTC ATTAAAACCG GGAGGTAGAT GAGATGTGAC 

8590 8600 8610 8620 8630 8640 

GAACGTGTAC ATCCACTGAA ATCCCTGGTA ATCCGTTTTA GAATCCATGA TAATAATTTT 

8650 8660 8670 8680 8690 8700 

CTGGATTATT GGTAATTTTT TTTGCACGTT CAAAATTTTT TGCAACCCCT TTTTGGAAAC 

8710 8720 8730 8740 8750 8760 

AAACACTACG GTAGGCTGCG AAATGTTCAT ACTGTTGAGC AATTCACGTT CATTATAAAT 

8770 8780 8790 8800 8810 8820 

GTCCTTCGCG GGCGCAACTG CAACTCCGAT AAATAACGCG CCCAACACCG GCATAAAGAA

8830 8840 8850 8860 8870 8880 

TTGAAGAGAG TTTTCACTCC ATACGACGAT TCTGTGATTT GTATTCAGCC CATATCGTTT 

8890 8900 8910 8920 8930 8940 

CATAGCTTCT GCCAACCGAA CGGACATTTC GAAGTATTCC GCGTACGTGA TGTTCACCTC 

8950 8960 8970 8980 8990 9000 

GATATGTGCA TCTGTAAAAG GAATTGTTCC AGGAACCAGG GCCTATCTCT TCATAGCCTT 

9010 9020 9030 9040 9050 9060

ATGCAGTTGC TCTCCAGCGG TTCCATCCTC TAGCTTTGCT TCTCAATTTC TTATTTGCAT 



9070 9080 9090 9100 9110 9120

AATGAGAAAA AAAGGAAAAT TAATTTTAAC ACCAATTCAG TAGTTGATTG AGCAAATGCG

9130 9140 9150 9160 9170 9180

TTCCCAAAAA GGATGCTTTA GACACAGTGT TCTCTGCACA GATAAGGACA AACATTATTC 

9190 9200 9210 9220 9230 9240 

AGAGGGAGTA CCCAGAGCTG AGACTCCTAA GCCAGTGAGT GCCACACCAT CCAGGGAGAA 

9250 9260 9270 9280 9290 9300 

ATATGCTTGT CATCACCGAA GCCTGATTCC GTAGAGCCAC ACCCTGGTAA GGGCCAATCT 

9310 9320 9330 9340 9350 9360 

GCTCACACAG GATAGAGAGG GCAGGAGCCA GGGCAGAGCA TATAAGGTGA GGTAGGATCA

9370 9380 9390 9400 9410 9420 

GTTGCTCCTC ACATTTGCTT CTGACATAGT TGTGTTGGGA GCTTGGATCG ATCCACCATG 

9430 9440 9450 9460 9470 9480 

GGCTTCAATA CCCTGATTGA CTGGAACAGC TGTAGCCCTG AACAGCAGCG TGCGCTGCTG 

9490 9500 9510 9520 9530 9540 

k -CGTCCGG CGATTTCCGC CTCTGACAGT ATTACCCGGA CGGTCAGCGA TATTTTGGAT 

9550 9560 9570 9580 9590 9600 

AATGTAAAAA CGCGCCGTGA CGATGCCCTG CGTGAATACA GCGCTAAATT TGATAAAACA 

9610 9620 9630 9640 9650 9660 

GAAGTGACAG CGCTACGCGT CACCCCTGAA GAGATCGCCG CCGCCGGCGC GCGTCTGAGC 

9670 9680 9690 9700 9710 9720 

GACGAATTAA AACAGGCGAT GACCGCTGCC GTCAAAAATA TTGAAACGTT CCATTCCGCG 

9730 9740 9750 9760 9770 9780

CAGACGCTAC CGCCTGTAGA TGTGGAAACC CAGCCAGGCG TGCGTTGCCA GCAGGTTACG 

9790 9800 9810 9820 9830 9840 

CGTCCCGTCT CGTCTGTCGG TCTGTATATT CCCGGCGGCT CGGCTCCGCT CTTCTCAACG

9850 9860 9870 9880 9890 9900 

^ .CTGATGC TGGCGACGCC GGCGCGCATT GCGGGATGCC AGAACGTGGT TCTGTGCTCG 

9910 9920 9930 9940 9950 9960

CCGCCGCCCA TCGCTGATGA AATCCTCTAT GCGGCGCAAC TGTGTGGCGT GCAGGAAATC 

9970 9980 9990 10000 10010 10020

TTTAACGTCG GCGGCGCGCA GGCGATTGCC GCTCTGGCCT TCGGCAGCGA GTCCGTACCG 

10030 10040 10050 10060 10070 10080

AAAGTGGATA AAATTTTTGG CCCCGGCAAC GCCTTTGTAA CCGAAGCCAA ACGTCAGGTC

10090 10100 10110 10120 10130 10140

AGCCAGCGTC TCGACGGCGC GGCTATCGAT ATGCCAGCCG GGCCGTCTGA AGTACTGGTG 

10150 10160 10170 10180 10190 10200 

ATCGCAGACA GCGGCGCAAC ACCGGATTTC GTCGCTTCTG ACCTGCTCTC CCAGGCTGAG 

10210 10220 10230 10240 10250 10260

CACGGCCCGG ATTCCCAGGT GATCCTGCTG ACGCCTGATG CTGACATTGC CCGCAAGGTG 

10270 10280 10290 10300 10310 10320 

GCGCAGGCGG TAGAACGTCA ACTGGCGCAA CTGCCGCGCG CGGACACCGC CCGGCAGGCC 

10330 10340 10350 10360 10370 10380 

CTGAGCGCCA GTCGTCTGAT TGTGACCAAA GATTTAGCGC AGTGCGTCGC CATCTCTAAT 

10390 10400 10410 10420 10430 10440

CAGTATGGGC CGGAACACTT AATCATCCAG ACGCGCAATG CGCGCGATTT GGTGGATGCG

10450 10460 10470 10480 10490 10500

ATTACCAGCG CAGGCTCGCT ATTTCTCGGC GACTGGTCGC CGGAATCCGC CGGTGATTAC 

10510 10520 10530 10540 10550 10560

GCTTCCGGAA CCAACCATGT TTTACCGACC TATGGCTATA CTGCTACCTG TTCCAGCCTT 

10570 10580 10590 10600 10610 10620 

GGGTTAGCGG ATTTCCAGAA ACGGATGACC GTTCAGGAAC TGTCGAAAGC GGGCTTTTCC 

10630 10640 10650 10660 10670 10680 

GCTCTGGCAT CAACCATTGA AACATTGGCG GCGGCAGAAC GTCTGACCGC CCATAAAAAT 

10690 10700 10710 10720 10730 10740 

GCCGTGACCC TGCGCGTAAA CGCCCTCAAG GAGCAAGCAT GAGCACTGAA AACACTCTCA 

10750 10760 10770 10780 10790 10800 

GCGTCGCTGA CTTAGCCCGT GAAAATGTCC GCAACCTGGA GATCCAGACA TGGATAAGAT 

10810 10820 10830 10840 10850 10860 

ACATTGATGA GTTTGGACAA ACCACAACTA GAATGCAGTG AAAAAAATGC TTTATTTGTG 

10870 10880 10890 10900 10910 10920 

AAATTTGTGA TGCTATTGCT TTATTTGTAA CCATTATAAG CTGCAATAAA CAAGTTAACA 

10930 10940 10950 10960 10970 10980 

ACAACAATTG CATTCATTTT ATGTTTCAGG TTCAGGGGGA GGTGTGGGAG GTTTTTTAAA 

10990 11000 11010 11020 11030 11040 

GCAAGTAAAA CCTCTACAAA TGTGGTATGG CTGATTATGA TCTCTAGGGC CGGCCCTCGA 

11050 11060 11070 11080 11090 11100 

CGGCGCGCCT GGCCGCTACT AACTCTCTCC TCCCTCCTTT TTCCTGCAGG CTCAAGGCGC 

11110 11120 11130 11140 11150 11160 

GCATGCCCGA CGGCGAGGAT CTCGTCGTCA CCCATGGCGA TGCCTGCTTG CCGAATATCA 

11170 11180 11190 11200 11210 11220 

TGGTGGAAAA TGGCCGCTTT TCTGCATTCA TCGACTGTGG CCGGCTGGGT GTGGCGGACC 

11230 11240 11250 11260 11270 11280 

GCTATCAGGA CATAGCGTTG GCTACCCGTC ATATTGCTGA AGAGCTTGGC GGCGAATGGG 

11290 11300 11310 11320 11330 11340 

CTGACCGCTT CCTCGTGCTT TACGGTATCG CCGCTCCCGA TTCGCAGCGC ATCGCCTTCT 

11350 11360 11370 11380 11390 11400 

ATCGCCTTCT TGACGAGTTC TTCTGAGCGG GACTCTGGGG TTCGAAATGA CCGACCAAGC 

11410 11420 11430 11440 11450 11460 

GACGCCCAAC CTGCCATCAC GAGATTTCGA TTCCACCGCC GCCTTCTATG AAAGGTTGGG 

11470 11480 11490 11500 11510 11520 

CTTCGGAATC GTTTTCCGGG ACGCCGGCTG GATGATCCTC CAGCGCGGGG ATCTCATGCT 

11530 11540 11550 11560 11570 11580 

GGAGTTCTTC GCCCACCCCA ACTTGTTTAT TGCAGCTTAT AATGGTTACA AATAAAGCAA 

11590 11600 11610 11620 11630 11640 

TAGCATCACA AATTTCACAA ATAAAGCATT TTTTTCACTG CATTCTAGTT GTGGTTTGTC 



11650 11660 11670 11680 11690 11700 

CAAACTCATC AATCTATCTT ATCATGTCTG GATCGCGCCC GGTCTCTCTC TAGCCCTAGG 
11710 11720 11730 11740 11750 11760 

TCTA^m GCAGAa"JtA TCCMtKST CCGCCATCTC CAGCAGCCGC ACGCGCCGCA 

11770 11780 11790 11800 11810 11820 

TCTCGGGCAG CGTTGGGTCC TGGCcSSg TGCGCATGAT CGTGCTCCTG TCGTTGAGGA 

11830 11840 11850 11860 11870 11880 

CCCGGCTAGG CTGGCGGGGT TGCCTTACTG GTTAGCAGAA TGAATCACCG ATACGCGAGC 

11890 11900 11910 11920 11930 11940 

gaacgSKS cgactgSJc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 

11950 11960 11970 11980 11990 12000 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaactc agcgccctgc accattatgt 

12010 12020 12030 12040 12050 12060 

tccgga^t? catcgSJS tgctgSSgc taccctgtgg aacacctaca tctgtattaa 

12070 12080 12090 12100 12110 12120 
CGAAGCG?S GCATTcicCC TGAGTGATTT TTCTCTGGTC CCGCCGCATC CATACCGCCA 

12130 12140 12150 12160 12170 12180 

GTTGTTTACC CTCACAACGT TCCAGTAACC GGGCATGTTC ATCATCAGTA ACCCGTATCG 

12190 12200 12210 12220 12230 12240 

TGAGCATCCT CTCTCGTTTC ATCGGTATCA TTACCCCCAT GAACAGAAAT CCCCCTTACA 

12250 12260 12270 12280 12290 12300 

CGGAGGCATC AGTGACCAAA CAGGAAAAAA CCGCCOTAA CATGGCCCGC TTTATCAGAA 

12310 12320 12330 12340 12350 12360 

GCCAGACATT AACGCTTCTG GAGAAACTCA ACGAGCTGGA CGCGGATGAA CAGGCAGACA 

12370 12380 12390 12400 12410 12420 

TCTGTGAATC GCTTCACGAC CACGCTGATG AGCTTTACCG CAGCTGCCTC GCGCGTTTCG 

12430 12440 12450 12460 12470 12480 

GTGATGACGG TGAAAACCTC TGACACATGC AGCTCCCGGA GACGGTCACA GCTTGTCTGT 

12490 12500 12510 12520 12530 12540 

AAGCGGATGC CGGGAGCAGA CAAGCCCGTC AGGGCGCGTC AGCGGGTGTT GGCGGGTGTC 

12550 12560 12570 12580 12590 12600 

GGGGCGCAGC CATGACCCAG TCACGTAGCG ATAGCGGAGT GTATACTGGC TTAACTATGC 

12610 12620 12630 12640 12650 12660 

GGCATCAGAG CAGATTGTAC TGAGAGTGCA CCATATGCGG TGTGAAATAC CGCACAGATG 

12670 12680 12690 12700 12710 12720 

CGTAAGGAGA AAATACCGCA TCAGGCGCTC TTCCGCTTCC TCGCTCACTG ACTCGCTGCG 

12730 12740 12750 12760 12770 12780 

CTCGGTCGTT CGGCTGCGGC GACCGGTATC AGCTCACTCA AAGGCGGTAA TACGGTTATC 

12790 12800 12810 12820 12830 12840 

CACAGAATCA GGGGATAACG CAGGAAAGAA CATGTGAGCA AAAGGCCAGC AAAAGGCCAG 

12850 12860 12870 12880 12890 12900 

GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG CTCCGCCCCC CTGACGAGCA 

12910 12920 12930 12940 12950 12960 

TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG ACAGGACTAT AAAGATACCA 

12970 12980 12990 13000 13010 13020 

GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT CCGACCCTGC CGCTTACCGG 
13030 13040 13050 13060 13070 13080 

ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT TCTCATAGCT CACCCTGTAG 

13090 13100 13110 13120 13130 13140 

GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC TGTGTGCACG AACCCCCCGT 

13150 13160 13170 13180 13190 13200 

TCAGCCCGAC CGCTGCGCCT TATCCCGTAA CTATCGTCTT GAGTCCAACC CGGTAAGACA 

13210 13220 13230 13240 13250 13260 

CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT AGCAGAGCGA GGTATGTAGG 

13270 13280 13290 13300 13310 13320 

CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC TACACTAGAA GGACAGTATT 

13330 13340 13350 13360 13370 13380 

TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA AGAGTTGCTA GCTCTTGATC 

13390 13400 13410 13420 13430 13440 

. ^CAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT TGCAAGCACC AGATTACGCG 

13450 13460 13470 13480 13490 13500 

CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT ACGGGGTCTG ACGCTCAGTG 

13510 13520 13530 13540 13550 13560 

GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA TCAAAAAGGA TCTTCACCTA 

13570 13580 13590 13600 13610 13620 

GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA AGTATATATG AGTAAACTTG 

13630 13640 13650 13660 13670 13680 

GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC TCAGCGATCT GTCTATTTCG 

13690 13700 13710 13720 13730 13740 

TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTACATAACT ACGATACGGG AGGGCTTACC 

13750 13760 13770 13780 13790 13800 

. CTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC TCACCGGCTC CAGATTTATC 

13810 13820 13830 13840 13850 13860 

AGCAATAAAC CAGCCACCCG GAAGCGCCGA GCGCAGAAGT GGTCCTGCAA CTTTATCCGC 

13870 13880 13890 13900 13910 13920 

CTCCATCCAG TCTATTAATT GTTGCCCGGA AGCTAGAGTA AGTAGTTCGC CAGTTAATAG 

13930 13940 13950 13960 13970 13980 

TTTGCGCAAC GTTGTTCCCA TTGCTGCAGG CATCGTGGTG TCACGCTCGT CGTTTGGTAT 

13990 14000 14010 14020 14030 14040 

GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT ACATGATCCC CCATGTTGTG 

14050 14060 14070 14080 14090 14100 

CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC AGAAGTAAGT TGGCCGCAGT 

14110 14120 14130 14140 14150 14160 

GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT ACTGTCATGC CATCCGTAAG 

14170 14180 14190 14200 14210 14220 

ATGCTTTTCT GTCACTGGTG AGTACTCAAC CAAGTCATTC TGAGAATAGT GTATGCGGCG 

14230 14240 14250 14260 14270 14280 

ACCGAGTTGC TCTTGCCCCG CGTCAACACG GGATAATACC GCGCCACATA GCAGAACTTT 



14290 14300 14310 14320 14330 14340 
AAAAGTGCTC ATCATTGGAA AACGTTCTTC GGGGCGAAAA CTCTCAACGA TCTTACCGCT 

14350 14360 14370 14380 14390 14400 

GTTGACATCC AGTTCGATGT AACCCACTCG TGCACCCAAC TGATCTTCAG CATCTTTTAC 

14410 14420 14430 14440 14450 14460 

TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA AATGCCGCAA AAAAGGGAAT 



14470 14480 14490 14500 14510 14520 

AAGGGCGACA CGGAAATGTT GAATACTCAT ACTCTTCCTT TTTCAATATT ATTGAAGCAT 

14530 14540 14550 14560 14570 14580 

TTATCAGGGT TATTGTCTCA TGAGCGGATA CATATTTGAA TGTATTTAGA AAAATAAACA 

14590 14600 14610 14620 14630 14640 

AATAGGGGTT CCGCGCACAT TTCCCCGAAA AGTGCCACCT GACGTCTAAG AAACCATTAT 

14650 14660 14670 14680 14690 14700 

TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG CCCTTTCCTC TTCAAGAA . . 
10 20 30 40 50 60 

TTAATTAAGG GGCGGAGAAT GGGCGGAACT GGGCGGAGTT AGGGGCGGGA TGGGCGGAGT 

70 80 90 100 110 120 

TAGGGGCGGG ACTATGGTTG CTGACTAATT GAGATGCATG CTTTGCATAC TTCTCCCTGC 

130 140 150 160 170 180 

TGGGGAGCCT GCCGACTTTC CACACCTCGT TGCTGACTAA TTGAGATGCA TGCTTTGCAT 

190 200 210 220 230 240 

ACTTCTGCCT GCTGGGGAGC CTGGGGACTT TCCACACCCT AACTGACACA CATTCCACAG 

250 260 270 280 290 300 

AATTAATTCC CCTAGTTATT AATAGTAATC AATTACGGGG TCATTAGTTC ATAGCCCATA 

310 320 330 340 350 360 

TATGGACTTC CGCCTTACAT AACTTACGGT AAATGGCCCG CCTGGCTGAC CGCCCAACGA 

370 380 390 400 410 420 

•.CCGCCCA TTGACCTCAA TAATGACGTA TGTTCCCATA GTAACGCCAA TAGGGACTTT 

430 440 450 460 470 480 

CCATTGACGT CAATGGGTGG AGTATTTACG GTAAACTGCC CACTTGGCAG TACATCAAGT 

490 500 510 520 530 540 

GTATCATATG CCAAGTACGC CCCCTATTGA CGTCAATCAC GGTAAATGGC CCGCCTGGCA 

550 560 570 580 590 600 

TTATGCCCAG TACATGACCT TATGGGACTT TCCTACTTGG CAGTACATCT ACGTATTAGT 

610 620 630 640 650 660 

CATCGCTATT ACCATGGTGA TGCGGTTTTG GCAGTACATC AATGGGCGTG GATAGCGGTT 

670 680 690 700 710 720 

TGACTCACGG GGATTTCCAA GTCTCCACCC CATTGACGTC AATGGGAGTT TGTTTTGAAG 

730 740 750 760 770 780 

TGGCCGGC CAGCTTTATT TAACGTGTTT ACGTCGAGTC AATTGTACAC TAACGACAGT 

790 800 810 820 830 840 

GATGAAAGAA ATACAAAAGC GCATAATATT TTGAACGACG TCGAACCTTT ATTACAAAAC 

850 860 870 880 890 900 

AAAACACAAA CGAATATCGA CAAAGCTAGA TTGCTGCTAC AAGATTTGCC AAGTTTTGTG 

910 920 930 940 950 960 

GCGTTGAGCG AAAATCCATT AGATAGTCCA GCCATCGGTT CGGAAAAACA ACCCTTGTTT 

970 980 990 1000 1010 1020 

GAAACTAATC GAAACCTATT TTACAAATCT ATTGAGGATT TAATATTTAA ATTCAGATAT 

1030 1040 1050 1060 1070 1080 

AAAGACGCTG AAAATCATTT GATTTTCGCT CTAACATACC ACCCTAAAGA TTATAAATTT 

1090 1100 1110 1120 1130 1140 

AATGAATTAT TAAAATACAT CAGCAACTAT ATATTGATAG ACATTTCCAG TTTGTCATAT 

1150 1160 1170 1180 1190 1200 

TAGTTTGTGC GTCTCATTAC AATGGCTGTT ATTTTTAACA ACAAACAACT GCTCGCAGAC 

1210 1220 1230 1240 1250 1260 

AATAGTATAG AAAAGGGAGG TGAACTGTTT TTGTTTAACG GTTCGTACAA CATTTTGGAA 

1270 1280 1290 1300 1310 1320 

ACTTATGTTA ATCCGGTGCT GCTAAAAAAT GGTGTAATTG AACTAGAAGA AGCTGCGTAC 
1330 1340 1350 1360 1370 1380 

TATGCCGCCA ACATATTGTA CAAAACCGAC CATCCCAAAT TCATTGATTA TATAAATTTA 

1390 1400 1410 1420 1430 1440 

ATAATTAAAG CAACACACTC CGAAGAACTA CCAGAAAATA GCACTGTTGT AAATTACAGA 

1450 1460 1470 1480 1490 1500 

AAAACTATGC GCAGCGGTAC TATACACCCC ATTAAAAAAG ACATATATAT TTATGACAAC 

1510 1520 1530 1540 1550 1560 

AAAAAATTTA CTCTATACGA TAGATACATA TATGGATACG ATAATAACTA TGTTAATTTT 

1570 1580 1590 1600 1610 1620 

TATGAGGAGA AAAATGAAAA AGAGAAGGAA TACGAAGAAG AAGACGACAA GCCGTCTAGT 

1630 1640 1650 1660 1670 1680 

TTATGTGAAA ATAAAATTAT ATTGTCGCAA ATTAACTGTG AATCATTTGA AAATGATTTT 

1690 1700 1710 1720 1730 1740 

AAATATTACC TCAGCGATTA TAACTACGCG TTTTCAATTA TAGATAATAC TACAAATGTT 

1750 1760 1770 1780 1790 1800 

CTTGTTGCGT TTGGTTTGTA TCGTTAATAA AAAACAAATT TGACATTTAT AATTGTTTTA 

1810 1820 1830 1840 1850 1860 

TTATTCAATA ATTACAAATA GGATTGAGAC CCTTGCAGTT GCCAGCAAAC GGACAGAGCT 

1870 1880 1890 1900 1910 1920 

TGTCGAGGAG AGTTGTTGAT TCATTGTTTG CCTCCCTGCT CCGGTTTTTC ACCGAAGTTC 

1930 1940 1950 1960 1970 1980 

ATGCCAGTCC AGCGTTTTTG CAGCAGAAAA GCCGCCGACT TCGGTTTGCG GTCGCGAGTG 

1990 2000 2010 2020 2030 2040 

AAGATCCCTT TCTTGTTACC GCCAACGCGC AATATGCCTT GCGAGGTCGC AAAATCGGCG 

2050 2060 2070 2080 2090 2100 

AAATTCCATA CCTGTTCACC GACGACGGCG CTGACGCGAT CAAAGACGCG GTGATACATA 

2110 2120 2130 2140 2150 2160 

TCCAGCCATG CACACTGATA CTCTTCACTC CACATGTCGG TGTACATTGA GTGCAGCCCG 

2170 2180 2190 2200 2210 2220 

GCTAACGTAT CCACGCCGTA TTCGGTGATG ATAATCGGCT GATGCAGTTT CTCCTGCCAG 

2230 2240 2250 2260 2270 2280 

GCCAGAAGTT CTTTTTCCAG TACCTTCTCT GCCGTTTCCA AATCGCCGCT TTGGACATAC 

2290 2300 2310 2320 2330 2340 

CATCCGTAAT AACGGTTCAG GCACACCACA TCAAAGAGAT CGCTGATGGT ATCGGTCTGA 

2350 2360 2370 2380 2390 2400 

GCGTCGCAGA ACATTACATT GACGCAGGTG ATCGGACGCG TCGGGTCGAG TTTACGCGTT 

2410 2420 2430 2440 2450 2460 

GCTTCCGCCA GTGGCGCCAA ATATTCCCGT GCACCTTGCG GACGGGTATC CGGTTCGTTG 

2470 2480 2490 2500 2510 2520 

GCAATACTCC ACATCACCAC GCTTGGGTGG TTTTTGTCAC GCGCTATCAG CTCTTTAATC 

2530 2540 2550 2560 2570 2580 

GCCTGTAAGT GCGCTTGCTG AGTTTCCCCG TTGACTGCCT CTTCGCTGTA CAGTTCTTTC 
2590 2600 2610 2620 2630 2640 

GGCTTGTTGC CCGCTTCGAA ACCAATCCCT AAACAGAGGT TAAAGCCGAC AGCAGCAGTT 

2650 2660 2670 2680 2690 2700 

TCATCAATCA CCACGATGCC ATGTTCATCT GCCCAGTCGA GCATCTCTTC AGCGTAAGGG 

2710 2720 2730 2740 2750 2760 

TAATGCGAGG TACGGTAGGA GTTGGCCCCA ATCCAGTCCA TTAATGCGTG GTCGTGCACC 

2770 2780 2790 2800 2810 2820 

ATCAGCACGT TATCGAATCC TTTGCCACGC AAGTCCGCAT CTTCATGACG ACCAAAGCCA 

2830 2840 2850 2860 2870 2880 

GTAAAGTAGA ACGGTTTGTG GTTAATCAGG AACTGTTCGC CCTTCACTGC CACTGACCGG 

2890 2900 2910 2920 2930 2940 

ATGCCGACGC GAAGCGGGTA GATATCACAC TCTGTCTGGC TTTTGGCTGT GACGCACAGT 

2950 2960 2970 2980 2990 3000 

"ATAGAGAT AACCTTCACC CGGTTGCCAG AGGTGCGGAT TCACCACTTG CAAAGTCCCG 

3010 3020 3030 3040 3050 3060 

CTAGTGCCTT GTCCAGTTGC AACCACCTGT TGATCCGCAT CACGCAGTTC AACGCTGACA 

3070 3080 3090 3100 3110 3120 

TCACCATTGG CCACCACCTG CCAGTCAACA GACGCGTGGT TACAGTCTTG CGCGACATGC 

3130 3140 3150 3160 3170 3180 

GTCACCACGC TGATATCGTC CACCCAGGTG TTCGGCGTGG TGTAGAGCAT TACGCTGCGA 

3190 3200 3210 3220 3230 3240 

TGGATTCCGG CATAGTTAAA GAAATCATGG AAGTAAGACT GCTTTTTCTT GCCGTTTTCC 

3250 3260 3270 3280 3290 3300 

TCGGTAATCA CCATTCCCGG CGGGATAGTC TGCCAGTTCA GTTCGTTGTT CACACAAACG 

3310 3320 3330 3340 3350 3360 

"GATACCCC TCGACGGATT AAACACTTCA AGCGGTCAAC TATGAAGAAG TGTTCGTCTT 

3370 3380 3390 3400 3410 3420 

CGTCCCAGTA AGCTATGTCT CCAGAATGTA GCCATCCATC CTTGTCAATC AAGGCGTTGG 

3430 3440 3450 3460 3470 3480 

TCGCTTCCGG ATTGTTTACA TAACCGGACA TAATCATAGG TCCTCTGACA CATAATTCGC 

3490 3500 3510 3520 3530 3540 

CTCTCTGATT AACGCCCAGC GTTTTCCCGG TATCCAGATC CACAACCTTC GCTTCAAAAA 

3550 3560 3570 3580 3590 3600 

ATGGAACAAC TTTACCGACC GCGCCCGGTT TATCATCCCC CTCGGGTGTA ATCAGAATAG 

3610 3620 3630 3640 3650 3660 

CTGATGTAGT CTCAGTGAGC CCATATCCTT GTCGTATCCC TGGAAGATGG AAGCGTTTTG 

3670 3680 3690 3700 3710 3720 

CAACCCCTTC CCCGACTTCT TTCGAAAGAG GTGCGCCCCC AGAAGCAATT TCGTGTAAAT 

3730 3740 3750 3760 3770 3780 

TAGATAAATC GTATTTGTCA ATCAGAGTGC TTTTGGCGAA GAATGAAAAT AGGGTTGGTA 

3790 3800 3810 3820 3830 3840 

CTAGCAACGC ACTTTGAATT TTGTAATCCT GAAGGGATCG TAAAAACAGC TCTTCTTCAA 

3850 3860 3870 3880 3890 3900 

ATCTATACAT TAAGACGACT CGAAATCCAC ATATCAAATA TCCGAGTGTA GTAAACATTC 
3910 3920 3930 3940 3950 3960 

CAAAACCGTG ATGGAATGGA ACAACACTTA AAATCGCAGT ATCCGGAATG ATTTGATTGC 

3970 3980 3990 4000 4010 4020 

CAAAAATAGG ATCTCTGGCA TGCGAGAATC TGACGCAGGC AGTTCTATGC GGAAGGGCCA 

4030 4040 4050 4060 4070 4080 

CACCCTTAGG TAACCCAGTA GATCCAGAGG AATTGTTTTG TCACGATCAA AGGACTCTGG 

4090 4100 4110 4120 4130 4140 

TACAAAATCG TATTCATTAA AACCGGGAGG TAGATGAGAT GTGACCAACG TGTACATCGA 

4150 4160 4170 4180 4190 4200 

CTGAAATCCC TGGTAATCCG TTTTAGAATC CATGATAATA ATTTTCTGGA TTATTGGTAA 

4210 4220 4230 4240 4250 4260 

TTTTTTTTGC ACGTTCAAAA TTTTTTGCAA CCCCTTTTTG GAAACAAACA CTACGGTAGG 

4270 4280 4290 4300 4310 4320 

"iCGAAATG TTCATACTGT TGAGCAATTC ACGTTCATTA TAAATGTCGT TCGCGGGCGC 

4330 4340 4350 4360 4370 4380 

AACTGCAACT CCGATAAATA ACGCGCCCAA CACCGGCATA AAGAATTGAA GAGAGTTTTC 

4390 4400 4410 4420 4430 4440 

ACTGCATACG ACGATTCTGT GATTTCTATT CAGCCCATAT CGTTTCATAG CTTCTGCCAA 

4450 4460 4470 4480 4490 4500 

CCGAACGGAC ATTTCGAAGT ATTCCGCGTA CGTGATGTTC ACCTCGATAT GTGCATCTGT 

4510 4520 4530 4540 4550 4560 

AAAAGGAATT GTTCCAGGAA CCAGGGCGTA TCTCTTCATA GCCTTATGCA GTTGCTCTCC 

4570 4580 4590 4600 4610 4620 

AGCGGTTCCA TCCTCTAGCT TTGCTTCTCA ATTTCTTATT TCCATAATGA GAAAAAAAGG 

4630 4640 4650 4660 4670 4680 

VATTAATT TTAACACCAA TTCAGTAGTT GATTGAGCAA ATGCGTTGCC AAAAAGGATG 

4690 4700 4710 4720 4730 4740 

CTTTAGAGAC AGTGTTCTCT GCACAGATAA GGACAAACAT TATTCAGAGG GAGTACCCAG 

4750 4760 4770 4780 4790 4800 

AGCTGAGACT CCTAAGCaG TGAGTGGCAC AGCATCCAGG GAGAAATATG CTTGTCATCA 

4810 4820 4830 4840 4850 4860 

CCGAAGCCTG ATTCCGTAGA GCCACACCCT GGTAAGGGCC AATCTGCTCA CACAGGATAG 

4870 4880 4890 4900 4910 4920 

AGAGGGCAGG AGCCAGGGCA GAGCATATAA GGTGAGGTAG GATCAGTTGC TCCTCACATT 

4930 4940 4950 4960 4970 4980 

TGCTTCTGAC ATAGTTGTGT TGGGAGCTTG GATCGATCCA CCATGGGCTT CAATACCCTG 

4990 5000 5010 5020 5030 5040 

ATTGACTGGA ACAGCTGTAG CCCTGAACAG CAGCGTGCGC TGCTGACGCG TCCGGCGATT 

5050 5060 5070 5080 5090 5100 

TCCGCCTCTG ACAGTATTAC CCGGACGGTC AGCGATATTC TGGATAATGT AAAAACGCGC 

5110 5120 5130 5140 5150 5160 

GGTGACGATG CCCTGCGTGA ATACAGCGCT AAATTTGATA AAACAGAAGT GACAGCGCTA 

5170 5180 5190 5200 5210 5220 

CGCGTCACCC CTGAAGAGAT CGCCGCCGCC GGCGCGCGTC TGAGCGACGA ATTAAAACAG 
5230 5240 5250 5260 5270 5280 

GCCATGACCG CTGCCGTCAA AAATATTGAA ACGTTCCATT CCCCGCAGAC GCTACCCCCT 

5290 5300 5310 5320 5330 5340 

GTAGATGTGG AAACCCAGCC AGGCGTGCGT TGCCAGCAGG TTACGCGTCC CGTCTCGTCT 

5350 5360 5370 5380 5390 5400 

GTCGCTCTGT ATATTCCCGG CGGCTCGGCT CCGCTCTTCT CAACGGTGCT GATGCTGGCG 

5410 5420 5430 5440 5450 5460 

ACGCCGGCGC GCATTGCGGG ATGCCAGAAG GTGGTTCTGT GCTCGCCGCC GCCCATCGCT 

5470 5480 5490 5500 5510 5520 

GATGAAATCC TCTATGCGGC GCAACTGTCT GGCGTGCAGG AAATCTTTAA CGTCGGCGGC 

5530 5540 5550 5560 5570 5580 

GCGCAGGCGA TTGCCGCTCT GGCCTTCGGC AGCGAGTCCG TACCGAAAGT GGATAAAATT 

5590 5600 5610 5620 5630 5640 

.TGGCCCCG GCAACGCCTT TGTAACCGAA GCCAAACGTC AGGTCAGCCA GCGTCTCGAC 

5650 5660 5670 5680 5690 5700 

GGCGCGGCTA TCGATATGCC AGCCGGGCCG TCTGAAGTAC TGGTGATCGC AGACAGCGGC 

5710 5720 5730 5740 5750 5760 

GCAACACCGG ATTTCGTCGC TTCTGACCTG CTCTCCCAGG CTGAGCACGG CCCGGATTCC 

5770 5780 5790 5800 5810 5820 

CACGTGATCC TGCTGACGCC TGATGCTGAC ATTGCCCGCA AGGTGGCGGA GGCGGTAGAA 

5830 5840 5850 5860 5870 5880 

CGTCAACTGG CGGAACTGCC GCGCGCGGAC ACCGCCCGGC AGGCCCTGAG CCCCAGTCGT 

5890 5900 5910 5920 5930 5940 

CTGATTGTGA CCAAAGATTT AGCGCAGTGC GTCGCCATCT CTAATCAGTA TGGGCCGGAA 

5950 5960 5970 5980 5990 6000 

JVCTTAATCA TCCAGACGCG CAATCCGCGC GATTTGGTGG ATGCGATTAC CAGCGCAGCC 

6010 6020 6030 6040 6050 6060 

TCGGTATTTC TCGCCGACTG GTCGCCGGAA TCCGCCGCTG ATTACGCTTC CGGAACCAAC 

6070 6080 6090 6100 6110 6120 

CATGTTTTAC CGACCTATGG CTATACTGCT ACCTGTTCCA GCCTTGGGTT AGCGGATTTC 

6130 6140 6150 6160 6170 6180 

CAGAAACGGA TCACCGTTCA GGAACTGTCG AAAGCGGGCT TTTCCGCTCT GGCATCAACC 

6190 6200 6210 6220 6230 6240 

ATTGAAACAT TGGCGCCGGC AGAACGTCTG ACCGCCCATA AAAATGCCGT GACCCTGCGC 

6250 6260 6270 6280 6290 6300 

GTAAACGCCC TCAAGGAGCA AGCATGAGGC ACTGAAAACA CTCTCAGCGT CGCTGACTTA 

6310 6320 6330 6340 6350 6360 

GCCCGTGAAA ATGTCCGCAA CCTGGAGATC CAGACATGAT AAGATACATT GATGAGTTTG 

6370 6380 6390 6400 6410 6420 

GACAAACCAC AACTAGAATG CAGTGAAAAA AATGCTTTAT TTGTGAAATT TGTGATGCTA 

6430 6440 6450 6460 6470 6480 

TTGCTTTATT TGTAACCATT ATAAGCTGCA ATAAACAAGT TAACAACAAC AATTGCATTC 

6490 6500 6510 6520 6530 6540 
ATTTTATCTT TCAGGTTCAG GGGGAGGTGT GGGAGCTTTT TTAAAGCAAG TAAAACCTCT 

6550 6560 6570 6580 6590 6600 

ACAAATGTGG TATGGCTGAT TATGATCTCT AGGGCCGGCC CTCGACGGCG CGCCTCTAGA 

6610 6620 6630 6640 6650 6660 

GCAGTGTGGT TTTGCAAGAG GAAGCAAAAA GCCTCTCCAC CCAGGCCTGG AATGTTTCCA 

6670 6680 6690 6700 6710 6720 

CCCAATGTCG AGCACTGTGG TTTTGCAAGA GGAAGCAAAA AGCCTCTCCA CCCAGGCCTG 

6730 6740 6750 6760 6770 6780 

GAATGTTTCC ACCCAATCTC GAGCAAACCC CGCCCAGCGT CTTGTCATTG GCGAATTCGA 

6790 6800 6810 6820 6830 6840 

ACACGCAGAT GCAGTCGGGG CGGCGCGGTC CCAGGTCCAC TTCGCATATT AAGGTGACGC 

6850 6860 6870 6880 6890 6900 

^•GTGGCCTC GAACACCGAG CGACCCTGCA GCCAATATGG GATCGGCCAT TGAACAAGAT 

6910 6920 6930 6940 6950 6960 

GGATTGCACG CAGGTTCTCC GGCCGCTTGG GTGGAGAGGC TATTCGGCTA TGACTGGGCA 

6970 6980 6990 7000 7010 7020 

CAACAGACAA TCGGCTGCTC TGATGCCGCC GTGTTCCGGC TGTCAGCGCA GGGGCGCCCG 

7030 7040 7050 7060 7070 7080 

GTTCTTTTTG TCAAGACCGA CCTGTCCGGT GCCCTGAATG AACTGCAGGT AAGTGCGGCC 

7090 7100 7110 7120 7130 7140 

GTCGATGGCC GAGGCGGCCT CGGCCTCTGC ATAAATAAAA AAAATTAGTC AGCCATGCAT 

7150 7160 7170 7180 7190 7200 

GGGGCGGAGA ATGGGCGGAA CTGGGCGGAG TTAGGGGCGG GATGGGCGGA GTTAGGGGCG 

7210 7220 7230 7240 7250 7260 

";actatggt tgctgactaa ttgagatgca tgctttgcat acttctgcct gctggggagc 

7270 7280 7290 7300 7310 7320 

CTGGGGACTT TCCACACCTG CTTGCTGACT AATTGAGATG CATGCTTTGC ATACTTCTGC 

7330 7340 7350 7360 7370 7380 

CTGCTGGGGA GCCTGGGGAC TTTCCACACC CTAACTCACA CACATTCCAC AGAATTAATT 

7390 7400 7410 7420 7430 7440 

CCCCTAGTTA TTAATAGTAA TCAATTACGG GGTCATTAGT TCATAGCCCA TATATGGAGT 

7450 7460 7470 7480 7490 7500 

TCCGCGTTAC ATAACTTACG GTAAATGGCC CGCCTGGCTG ACCGCCCAAC CACCCCCGCC 

7510 7520 7530 7540 7550 7560 

aTTGACGTC AATAATGACG TATGTTCCCA TAGTAACGCC AATAGGGACT TTCCATTGAC 

7570 7580 7590 7600 7610 7620 

GTCAATGGCT GGACTATTTA CGGTAAACTG CCCACTTGGC AGTACATCAA GTGTATCATA 

7630 7640 7650 7660 7670 7680 

TGCCAAGTAC GCCCCCTATT GACGTCAATG ACGGTAAATG GCCCGCCTGG CATTATGCCC 

7690 7700 7710 7720 7730 7740 

AGTACATGAC CTTATCGGAC TTTCCTACTT GGCAGTACAT CTACGTATTA GTCATCGCTA 

7750 7760 7770 7780 7790 7800 

TTACCATGGT GATCCGGTTT TGGCAGTACA TCAATGGGCG TCGATAGCGG TTTGACTCAC 
7810 7820 7830 7840 7850 7860 

GCGCATTTCC AAGTCTCCAC CCCATTGACG TCAATGGGAG TTTGTTTTGG CACCAAAATC 

7870 7880 7890 7900 7910 7920 

AACGGGACTT TCCAAAATGT CGTAACAACT CCGCCCCATT GACGCAAATG GGCGGTAGGC 

7930 7940 7950 7960 7970 7980 

GTGTACGGTG GGAGGTCTAT ATAAGCAGAG CTGGGTACGT GAACCGTCAG ATCGCCTGGA 

7990 8000 8010 8020 8030 8040 

GACGCCATCA CAGATCTCTC ACTATGGATT TTCAGGTGCA GATTATCAGC TTCCTGCTAA 

8050 8060 8070 8080 8090 8100 

TCAGTGCTTC AGTCATAATG TCCAGAGGAC AAATTGTTCT CTCCCAGTCT CCAGCAATCC 

8110 8120 8130 8140 8150 8160 

TGTCTGCATC TCCAGGGGAG AAGGTCACAA TGACTTGCAG GGCCAGCTCA AGTGTAAGTT 

8170 8180 8190 8200 8210 8220 

ATCCACTG GTTCCAGCAG AAGCCAGGAT CCTCCCCCAA ACCCTGGATT TATGCCACAT 

8230 8240 8250 8260 8270 8280 

£CAACCTGGC TTCTGGAGTC CCTGTTCGCT TCAGTGGCAG TGGGTCTGGG ACTTCTTACT 

8290 8300 8310 8320 8330 8340 

CTCTCACAAT CAGCAGAGTG GAGGCTGAAG ATGCTGCCAC TTATTACTGC CAGCAGTGGA 

8350 8360 8370 8380 8390 8400 

CTAGTAACCC ACCCACGTTC GGAGGGGGGA CCAAGCTGGA AATCAAACGT ACGGTGGCTG 

8410 8420 8430 8440 8450 8460 

CACCATCTGT CTTCATCTTC CCGCCATCTG ATGAGCAGTT GAAATCTGGA ACTGCCTCTG 

8470 8480 8490 8500 8510 8520 

TTGTGTGCCT GCTGAATAAC TTCTATCCCA GAGAGGCCAA AGTACAGTGG AAGGTGGATA 

8530 8540 8550 8560 8570 8580 

:gccctcca ATCGGGTAAC TCCCAGGAGA GTGTCACAGA GCAGGACAGC AAGGACAGCA 

8590 8600 8610 8620 8630 8640 

CCTACAGCCT CAGCAGCACC CTGACGCTGA GCAAAGCAGA CTACGAGAAA CACAAAGTCT 

8650 8660 8670 8680 8690 8700 

ACGCCTGCGA AGTCACCCAT CAGGGCCTGA GCTCGCCCGT CACAAAGAGC TTCAACAGGG 

8710 8720 8730 8740 8750 8760 

GAGAGTGTTG AATTCAGATC CGTTAACGGT TACCAACTAC CTAGACTGGA TTCGTGACAA 

8770 8780 8790 8800 8810 8820 

CATGCGGCCG TGATATCTAC GTATGATCAG CCTCGACTGT GCCTTCTAGT TGCCAGCCAT 

8830 8840 8850 8860 8870 8880 

CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT TGACCCTGGA AGGTGCCACT CCCACTGTCC 

8890 8900 8910 8920 8930 8940 

TTTCCTAATA AAATGAGGAA ATTGCATCGC ATTGTCTGAG TAGGTGTCAT TCTATTCTGG 

8950 8960 8970 8980 8990 9000 

GGGGTGGGGT GGGGCAGGAC AGCAAGGGGG AGGATTGGGA AGACAATAGC AGGCATGCTG 

9010 9020 9030 9040 9050 9060 

GGGATGCGGT GGGCTCTATG GAACCAGCTG GGGCTCGACA GCTATGCCAA GTACGCCCCC 

9070 9080 9090 9100 9110 9120 

TATTGACGTC AATGACGGTA AATGGCCCGC CTGGCATTAT GCCCAGTACA TGACCTTATG 
9130 9140 9150 9160 9170 9180 

GGACTTTCCT ACTTGCCAGT ACATCTACGT ATTACTCATC GCTATTACCA TGGTCATGCG 

9190 9200 9210 9220 9230 9240 

GTTTTGGCAC TACATCAATG GGCCTGGATA GCGGTTTGAC TCACGGGGAT TTCCAAGTCT 

9250 9260 9270 9280 9290 9300 

CCACCCCATT GACGTCAATG GGACTTTCTT TTGGCACCAA AATCAACGGG ACTTTCCAAA 

9310 9320 9330 9340 9350 9360 

ATGTCGTAAC AACTCCGCCC CATTGACGCA AATGGGCGGT AGGCGTGTAC GGTGGGAGGT 

9370 9380 9390 9400 9410 9420 

CTATATAAGC AGAGCTGGGT ACGTCCTCAC ATTCACTGAT CAGCACTGAA CACAGACCCG 

9430 9440 9450 9460 9470 9480 

TCGACATGGG TTGGAGCCTC ATCTTGCTCT TCCTTGTCGC TGTTGCTACG CGTGTCCTGT 

9490 9500 9510 9520 9530 9540 

CCCAGGTACA ACTGCAGCAG CCTGCGGCTG AGCTGGTGAA GCCTGGGGCC TCAGTGAAGA 

9550 9560 9570 9580 9590 9600 

TCTCCTGCAA GGCTTCTGGC TAaCATTTA CCAGTTACAA TATGCACTGG GTAAAACAGA 

9610 9620 9630 9640 9650 9660 

CACCTGGTCG GGGCCTGGAA TGGATTGGAG CTATTTATCC CGGAAATGGT GATACTTCCT 

9670 9680 9690 9700 9710 9720 

ACAATCAGAA GTTCAAAGGC AAGGCCACAT TGACTGCAGA CAAATCCTCC AGCACAGCCT 

9730 9740 9750 9760 9770 9780 

ACATGCAGCT CAGCAGCCTG ACATCTGAGG ACTCTGCGGT CTATTACTGT GCAAGATCGA 

9790 9800 9810 9820 9830 9840 

CTTACTACGG CGGTGACTGG TACTTCAATG TCTGGGGCGC AGGGACCACG GTCACCGTCT 

9850 9860 9870 9880 9890 9900 

CTCCAGCTAG CACCAAGGGC CCATCGGTCT TCCCCCTGGC ACCCTCCTCC AAGAGCACCT 

9910 9920 9930 9940 9950 9960 

CTGGGGGCAC AGCGGCCCTG GGCTGCCTGG TCAAGGACTA CTTCCCCGAA CCGGTGACGG 

9970 9980 9990 10000 10010 10020 

TGTCGTGGAA CTCAGGCGCC CTGACCAGCG GCGTGCACAC CTTCCCGGCT GTCCTACAGT 

10030 10040 10050 10060 10070 10080 

CCTCAGGACT CTACTCCCTC AGCAGCGTGG TGACCGTGCC CTCCAGCAGC TTGGGCACCC 

10090 10100 10110 10120 10130 10140 

ACACCTACAT CTGCAACGTG AATCACAAGC CCAGCAACAC CAAGGTGGAC AAGAAAGCAG 

10150 10160 10170 10180 10190 10200 

AGCCCAAATC TTGTGACAAA ACTCACACAT GCCCACCGTG CCCAGCACCT GAACTCCTGG 

10210 10220 10230 10240 10250 10260 

GGGGACCGTC AGTCTTCCTC TTCCCCCCAA AACCCAAGGA CACCCTCATG ATCTCCCGGA 

10270 10280 10290 10300 10310 10320 

CCCCTGAGGT CACATGCGTG GTGGTGGACG TGAGCCACGA AGACCCTGAG GTCAAGTTCA 

10330 10340 10350 10360 10370 10380 

ACTGGTACGT GGACGGCGTG GAGGTGCATA ATGCCAAGAC AAAGCCGCGG GAGGAGCAGT 



10390 10400 10410 10420 10430 10440 
ACAACAGCAC GTACCGTGTG GTCAGCGTCC TCACCGTCCT GCACCAGGAC TGGCTGAATG 

10450 10460 10470 10480 10490 10500 

GCAAGGACTA CAAGTGCAAG GTCTCCAACA AAGCCCTCCC AGCCCCCATC GAGAAAACCA 

10510 10520 10530 10540 10550 10560 

TCTCCAAAGC CAAAGGGCAG CCCCGAGAAC CACAGCTGTA CACCCTGCCC CCATCCCGGG 

10570 10580 10590 10600 10610 10620 

ATCAGCTGAC CAAGAACCAG GTCAGCCTGA CCTGCCTCGT CAAAGGCTTC TATCCCAGCG 

10630 10640 10650 10660 10670 10680 

ACATCGCCGT GGAGTGGGAG AGCAATGGGC AGCCGGAGAA CAACTACAAG ACaCGCCTC 

10690 10700 10710 10720 10730 10740 

CCGTGCTGGA CTCCGACGGC TCCTTCTTCC TCTACAGCAA GCTCACCGTG GACAAGAGCA 

10750 10760 10770 10780 10790 10800 

'"TTGGCAGCA GGGGAACGTC TTCTCATGCT CCGTGATGCA TGAGGCTCTG CACAACCACT 

10810 10820 10830 10840 10850 10860 

ACACGCAGAA GAGCCTCTCC CTCTCTCCGG GTAAATGAGG ATCCGTTAAC GGTTACCAAC 

10870 10880 10890 10900 10910 10920 

TACCTAGACT GGATTCGTGA CAACATGCGG CCGTGATATC TACGTATGAT CAGCCTCGAC 

10930 10940 10950 10960 10970 10980 

TGTCCCTTCT AGTTGCCAGC CATCTGTTGT TTGCCCCTCC CCCGTGCCTT CCTTGACCCT 

10990 11000 11010 11020 11030 11040 

GGAAGGTGCC ACTCCCACTG TCCTTTCCTA ATAAAATGAG GAAATTGCAT CGCATTGTCT 

11050 11060 11070 11080 11090 11100 

GAGTAGGTGT CATTCTATTC TGGGGGGTGG GGTGGGGCAG GACAGCAAGG GGGAGGATTG 

11110 11120 11130 11140 11150 11160 

"'iAAGACAAT AGCAGGCATG CTGGGGATGC GGTGGGCTCT ATGCAACCAG CTGGGGCTCG 

11170 11180 11190 11200 11210 11220 

ACAGCAACGC TAGGTCGAGG CCGCTACTAA CTCTCTCCTC CCTCCTTTTT CCTGCAGGAC 

11230 11249 11259 11269 11279 11280 

GAGGCAGCGC GGCTATCGTG GCTGGCCACG ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC 

11290 11300 11310 11320 11330 11340 

GTTGTCACTG AAGCGGGAAG GGACTGGCTG CTATTGGGCG AAGTGCCGGG GCAGGATCTC 

11350 11360 11370 11380 11390 11400 

CTGTCATCTC ACCTTGCTCC TGCCGAGAAA GTATCCATCA TGGCTGATGC AATGCGGCGG 

11410 11420 11430 11440 11450 11460 

CTGCATACGC TTGATCCGGC TACCTGCCCA TTCGACCACC AAGCGAAACA TCGCATCGAG 

11470 11480 11490 11500 11510 11520 

CGAGCACGTA CTCCGATGGA AGCCGGTCTT GTCGATCAGG ATGATCTGGA CGAAGAGCAT 

11530 11540 11550 11560 11570 11580 

CAGGCCCTCG CGCCAGCCGA ACTGTTCGCC AGGTAAGTGA GCTCCAATTC AAGCTTCCTA 

11590 11600 11610 11620 11630 11640 

GGGCGGCCAG CTAGTAGCTT TGCTTCTCAA TTTCTTATTT GCATAATGAG AAAAAAAGGA 

11650 11660 11670 11680 11690 11700 

AAATTAATTT TAACACCAAT TCAGTAGTTG ATTGAGCAAA TGCGTTGCCA AAAAGGATGC 
11710 11720 11730 11740 11750 11760 

TTTAGAGACA GTGTTCTCTG CACAGATAAG GACAAACATT ATTCAGAGGG AGTACCCACA 



11770 11780 11790 11800 11810 11820 

GCTGAGACTC CTAAGCCAGT GAGTGGCACA GCATCCAGGG AGAAATATGC TTGTCATCAC 

11830 11840 11850 11860 11870 11880 

CGAAGCCTGA TTCCGTAGAG CCACACCCTG GTAAGGGCCA ATCTGCTCAC ACAGGATAGA 

11890 11900 11910 11920 11930 11940 

GAGGGCAGGA GCCAGGGCAG AGCATATAAG GTGAGGTAGG ATCAGTTGCT CCTCACATTT 

11950 11960 11970 11980 11990 12000 

GCTTCTGACA TAGTTGTGTT GGGAGCTTGG ATAGCTTGGG GGGGGGACAG CTCAGGGCTG 

12010 12020 12030 12040 12050 12060 

CGATTTCGCG CCAAACTTGA CGGCAATCCT AGCGTGAAGG CTGGTAGGAT TTTATCCCCG 

12070 12080 12090 12100 12110 12120 

GCCATCAT GGTTCGACCA TTGAACTGCA TCGTCGCCGT GTCCCAAAAT ATGGGGATTG 

12130 12140 12150 12160 12170 12180 

GCAAGAACGG AGACCTACCC TGGCCTCCGC TCAGGAACGA GTTCAAGTAC TTCCAAAGAA 

12190 12200 12210 12220 12230 12240 

TGACCACAAC CTCTTCAGTG GAAGGTAAAC AGAATCTGGT GATTATGGGT AGGAAAACCT 

12250 12260 12270 12280 12290 12300 

GGTTCTCCAT TCCTGAGAAG AATCGACCTT TAAAGGACAG AATTAATATA GTTCTCAGTA 

12310 12320 12330 12340 12350 12360 

GAGAACTCAA AGAACCACCA CGAGGAGCTC ATTTTCTTGC CAAAAGTTTG GATGATGCCT 

12370 12380 12390 12400 12410 12420 

TAAGACTTAT TGAACAACCG GAATTCGCAA GTAAAGTAGA CATGGTTTGG ATAGTCGGAG 

12430 12440 12450 12460 12470 12480 

A GTTCTGT TTACCAGGAA GCCATGAATC AACCAGGCCA CCTCAGACTC TTTGTGACAA 

12490 12500 12510 12520 12530 12540 

GGATCATGCA GGAATTTGAA AGTCACACGT TTTTCCCAGA AATTCATTTG GGGAAATATA 

12550 12560 12570 12580 12590 12600 

AACTTCTCCC AGAATACCCA GGCGTCCTCT CTGAGGTCCA GGAGGAAAAA GGCATCAAGT 

12610 12620 12630 12640 12650 12660 

ATAAGTTTGA AGTCTACGAG AAGAAAGACT AACAGGAAGA TGCTTTCAAG TTCTCTGCTC 

12670 12680 12690 12700 12710 12720 

CCCTCCTAAA GCTATGCATT TTTATAAGAC CATGGGACTT TTGCTGGCTT TAGATCAGCC 

12730 12740 12750 12760 12770 12780 

TCGACTGTGC CTTCTAGTTG CCAGCCATCT GTTGTTTGCC CCTCCCCCGT GCCTTCCTTG 

12790 12800 12810 12820 12830 12840 

ACCCTGGAAG GTGCCACTCC CACTGTCCTT TCCTAATAAA ATGAGGAAAT TGCATCGCAT 

12850 12860 12870 12880 12890 12900 

TGTCTGAGTA GGTGTCATTC TATTCTGGGG GGTGGGGTGG GGCAGGACAG CAAGGGGGAG 

12910 12920 12930 12940 12950 12960 

GATTGGGAAG ACAATAGCAG GCATGCTGGG GATGCGGTGG GCTCTATGGC TTCTGAGGCG 

12970 12980 12990 13000 13010 13020 

GAAAGAACCA GCTGGGGCTC GAAGCGGCCG CCCATTTCGC TGGTGGTCAG ATGCGGGATG 
13030 13040 13050 13060 13070 13080 

GCGTGGGACG CGGCGGGGAG CGTCACACTG AGGTTTTCCG CCAGACGCCA CTGCTGCCAG 

13090 13100 13110 13120 13130 13140 

GCGCTGATGT GCCCGGCTTC TGACCATGCG GTCGCGTTCG GTTGCACTAC GCGTTCTGTG 

13150 13160 13170 13180 13190 13200 

AGCCAGAGTT GCCCGGCGCT CTCCGGCTGC GGTAGTTCAG GCAGTTCAAT CAACTGTTTA 

13210 13220 13230 13240 13250 13260 

CCTTGTGGAG CGACATCCAG AGGCACTTCA CCGCTTGCCA GCGGCTTACC ATCa6CGCC 

13270 13280 13290 13300 13310 13320 

ACCATCCAGT GCAGGAGCTC GTTATCGCTA TGACGGAACA GGTATTCGCT CGTCACTTCG 

13330 13340 13350 13360 13370 13380 

ATGGTTTGCC CGGATAAACG GAACTGGAAA AACTGCTGCT GGTGTTTTGC TTCCGTCAGC 

13390 13400 13410 13420 13430 13440 

c . GGATGCG GCGTGCGGTC GGCAAAGACC AGACCGTTCA TACAGAACTG GCGATCGTTC 

13450 13460 13470 13480 13490 13500 

GGCGTATCGC CAAAATCACC GCCGTAAGCC GACCACGGGT TGCCGTTTTC ATCATATTTA 

13510 13520 13530 13540 13550 13560 

ATCAGCGACT GATCCACCCA GTCCCAGACG AAGCCGCCCT GTAAACGGGG ATACTGACGA 

13570 13580 13590 13600 13610 13620 

AACGCCTGCC AGTATTTAGC GAAACCGCCA AGACTGTTAC CCATCGCGTG GGCGTATTCG 

13630 13640 13650 13660 13670 13680 

CAAAGGATCA GCGGGCGCGT CTCTCCAGGT AGCGAAAGCC ATTTTTTGAT GGACCATTTC 

13690 13700 13710 13720 13730 13740 

GGCACAGCCG GGAAGGGCTG GTCTTCATCC ACGCGCGCGT ACATCGGGCA AATAATATCG 

13750 13760 13770 13780 13790 13800 

v, .GGCCGTGG TGTCGGCTCC GCCGCCTTCA TACTGCACCG GGCGGGAAGG ATCGACAGAT 

13810 13820 13830 13840 13850 13860 

TTCATCCAGC GATACAGCGC GTCGTGATTA GCGCCGTCGC CTGATTCATT CCCCAGCGAC 

13870 13880 13890 13900 13910 13920 

CAGATGATCA CACTCGGGTG ATTACGATCG CGCTGCACCA TTCGCGTTAC GCGTTCGCTC 

13930 13940 13950 13960 13970 13W 

ATCGCCGGTA GCCAGCGCGG ATCATCGGTC AGACGATTCA TTGGCACCAT GCCGT6GGTT 

13990 14000 14010 14020 14030 14040

TCAATATTGG CTTCATCCAC CACATACAGG CCGTAGCGGT CGCACAGCGT GTACCACAGC 

14050 14060 14070 14080 14090 14100

GGATCGTTCG GATAATGCGA ACAGCGCACG GCGTTAAAGT TGTTCTGCTT CATCAGCAGG 

14110 14120 14130 14140 14150 14160

ATATCCTGCA CCATCGTCTG CTCATCCATG ACCTGACCAT GCAGAGGATG ATGCTCGTGA

14170 14180 14190 14200 14210 14220 

CGGTTAACGC CTCGAATCAG CAACGGCTTG CCGTTCAGCA GCAGCAGACC ATTTTCAATC 

14230 14240 14250 14260 14270 14280 

CGCACCTCGC GGAAACCGAC ATCGCAGGCT TCTGCTTCAA TCAGCGTGCC GTCGGCGGTG 



14290 14300 14310 14320 14330 14340 
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TGCAGTTCAA CCACCGCACG ATAGAGATTC GGGATTTCGG CGCTCCACAG TTTCGGCTTT

14350 14360 14370 14380 14390 14400

TCGACGTTCA GACGTAGTGT GACGCGATCG GCATAACCAC CACGCTCATC GATAATTTCA

14410 14420 14430 14440 14450 14460 

CCGCCGAAAG GCGCGGTGCC GCTGGCGACC TGCGTTTCAC CCTGCCATAA AGAAACTGTT 

14470 14480 14490 14500 14510 14520

ACCCGTAGGT AGTCACGCAA CTCGCCGCAC ATCTGAACTT CAGCCTCCAG TACAGCGCGG

14530 14540 14550 14560 14570 14580

CTGAAATCAT CATTAAAGCG AGTGGCAACA TGGAAATCGC TGATTTGTGT ACTCCCTTTA

14590 14600 14610 14620 14630 14640

TGCAGCAACG AGACGTCACG GAAAATGCCG CTCATCCGCC ACATATCCTG ATCTTCCAGA

14650 14660 14670 14680 14690 14700

TAACTGCCGT CACTCCAGCG CAGCACCATC ACCGCGAGGC GGTTTTCTCC GGCGCGTAAA

14710 14720 14730 14740 14750 14760

AATGCGCTCA GGTCAAATTC AGACGGCAAA CGACTGTCCT GGCCGTAACC GACCCAGCGC

14770 14780 14790 14800 14810 14820

CCGTTGCACC ACAGATGAAA CGCCGAGTTA ACGCCATCAA AAATAATTCG CGTCTGGCCT 

14830 14840 14850 14860 14870 14880

TCCTGTAGCC AGCTTTCATC AACATTAAAT GTGAGCGAGT AACAACCCGT CGGATTCTCC 

14890 14900 14910 14920 14930 14940

GTCGGAACAA ACGGCGGATT GACCGTAATG GGATAGGTGA CGTTGGTGTA GATGGGCGCA

14950 14960 14970 14980 14990 15000

TCGTAACCGT GCATCTGCCA GTTTGAGGGG ACGACGACAG TATCGGCCTC AGGAAGATCG

15010 15020 15030 15040 15050 15060

CACTCCAGCC AGCTTTCCGG CACCGCTTCT GGTGCCGGAA ACCAGGCAAA GCGCCATTCG

15070 15080 15090 15100 15110 15120

CCATTCAGGC TGCGCAACTG TTGGGAAGGG CGATCGGTGC GGGCCTCTTC GCTATTACGC

15130 15140 15150 15160 15170 15180

CAGCTGGCGA AAGGGGGATG TGCTGCAAGG CGATTAAGTT GGGTAACGCC AGGGTTTTCC

15190 15200 15210 15220 15230 15240

CAGTCACGAC GTTGTAAAAC GACTTAATCC GTCGAGGGGC TGCCTCGAAG CAGACGACCT

15250 15260 15270 15280 15290 15300

TCCGTTGTGC AGCCAGCGGC GCCTGCGCCG GTGCCCACAA TCGTGCGCGA ACAAACTAAA

15310 15320 15330 15340 15350 15360

CCAGAACAAA TTATACCGGC GGCACCGCCG CCACCACCTT CTCCCGTGCC TAACATTCCA

15370 15380 15390 15400 15410 15420

GCGCCTCCAC CACCACCACC ACCATCGATG TCTGAATTGC CGCCCGCTCC ACCAATGCCG

15430 15440 15450 15460 15470 15480

ACGGAACCTC AACCCGCTGC ACCTTTAGAC GACAGACAAC AATTGTTGGA AGCTATTAGA 

15490 15500 15510 15520 15530 15540

AACGAAAAAA ATCGCACTCG TCTCAGACCG GTCAAACCAA AAACGGCGCC CGAAACCAGT 

15550 15560 15570 15580 15590 15600

ACAATAGTTG AGGTGCCGAC TGTGTTGCCT AAAGAGACAT TTGAGCCTAA ACCGCCGTCT
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[Sequence positions 15601 to 16920 are illegible in the source scan.]
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16930 16940 16950 16960 16970 16980

TACTGAGAGT GCACCATATG CGGTGTCAAA TACCGCACAG ATGCGTAACG AGAAAATACC 

16990 17000 17010 17020 17030 17040 

GCATCAGGCG CTCTTCCGCT TCCTCGCTCA CTGACTCGCT CCGCTCGGTC GTTCCGCTGC 

17050 17060 17070 17080 17090 17100 

GGCGAGCGGT ATCAGCTCAC TCAAAGGCGG TAATACGGTT ATCCACAGAA TCAGGGGATA

17110 17120 17130 17140 17150 17160 

ACGCAGGAAA GAACATGTGA GCAAAAGGCC AGCAAAAGGC CAGGAACCGT AAAAAGGCCG 

17170 17180 17190 17200 17210 17220 

CGTTCCTGGC GTTTTTCCAT AGGCTCCGCC CCCCTGACGA GCATCACAAA AATCGACGCT

17230 17240 17250 17260 17270 17280 

CAAGTCAGAG GTGGCGAAAC CCGACAGGAC TATAAAGATA CCAGGCGTTT CCCCCTGGAA 

17290 17300 17310 17320 17330 17340 

GCTCCCTCGT GCGCTCTCCT GTTCCGACCC TGCCGCTTAC CGGATACCTG TCCGCCTTTC

17350 17360 17370 17380 17390 17400

TCCCTTCGGG AAGCGTGGCG CTTTCTCATA GCTCACGCTG TAGGTATCTC AGTTCGGTGT

17410 17420 17430 17440 17450 17460 

AGGTCGTTCG CTCCAAGCTG GGCTGTGTGC ACGAACCCCC CGTTCAGCCC GACCGCTGCG 

17470 17480 17490 17500 17510 17520

CCTTATCCGG TAACTATCGT CTTGAGTCCA ACCCGGTAAG ACACGACTTA TCGCCACTGG

17530 17540 17550 17560 17570 17580 

CACCACCCAC TGGTAACAGG ATTAGCAGAG CGAGGTATGT AGGCGGTGCT ACAGAGTTCT 

17590 17600 17610 17620 17630 17640 

TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT ATTTGGTATC TGCGCTCTGC

17650 17660 17670 17680 17690 17700 

TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG ATCCGGCAAA CAAACCACCG 

17710 17720 17730 17740 17750 17760 

CTGGTAGCGG TGCTTTTTTT GTTTGCAAGC AGCAGATTAC GCGCAGAAAA AAAGGATCTC 

17770 17780 17790 17800 17810 17820 

AAGAAGATCC TTTGATCTTT TCTACGGGGT CTGACGCTCA GTGGAACGAA AACTCACGTT

17830 17840 17850 17860 17870 17880

AAGGGATTTT GGTCATGAGA TTATCAAAAA GGATCTTCAC CTAGATCCTT TTAAATTAAA

17890 17900 17910 17920 17930 17940 

AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC TTGGTCTGAC ACTTACCAAT 

17950 17960 17970 17980 17990 18000 

GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTGTCTATT TCGTTCATCC ATAGTTGCCT

18010 18020 18030 18040 18050 18060 

GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC CCCAGTGCTG 

18070 18080 18090 18100 18110 18120 

CAATCATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT ATCAGCAATA AACCAGCCAG

18130 18140 18150 18160 18170 18180 

CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC CGCCTCCATC CAGTCTATTA 

18190 18200 18210 18220 18230 18240 
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ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC AACGTTGTTG

18250 18260 18270 18280 18290 18300

CCATTGCTGC AGGCATCGTG CTGTCACGCT CGTCGTTTGG TATGGCTTCA TTCAGCTCCG

18310 18320 18330 18340 18350 18360

GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATGTT GTGCAAAAAA GCGGTTAGCT

18370 18380 18390 18400 18410 18420

CCTTCGGTCC TCCGATCGTT GTCAGAAGTA AGTTGGCCGC AGTGTTATCA CTCATGGTTA

18430 18440 18450 18460 18470 18480

TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT AAGATGCTTT TCTGTGACTG

18490 18500 18510 18520 18530 18540

GTGAGTACTC AACCAAGTCA TTCTGAGAAT AGTGTATGCG GCGACCGAGT TGCTCTTGCC

18550 18560 18570 18580 18590 18600

CGGCGTCAAC ACGGGATAAT ACCGCGCCAC ATAGCAGAAC TTTAAAAGTG CTCATCATTC

18610 18620 18630 18640 18650 18660

GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC GCTGTTGAGA TCCAGTTCGA

18670 18680 18690 18700 18710 18720

TGTAACCCAC TCGTGCACCC AACTCATCTT CAGCATCTTT TACTTTCACC AGCGTTTCTG

18730 18740 18750 18760 18770 18780

GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG ACACGGAAAT

18790 18800 18810 18820 18830 18840

GTTCAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG CATTTATCAG GGTTATTGTC

18850 18860 18870 18880 18890 18900

TCATGAGCGG ATACATATTT GAATGTATTT AGAAAAATAA ACAAATAGGG GTTCCGCGCA

18910 18920 18930 18940 18950 18960

CATTTCCCCG AAAAGTGCCA CCTGACGTCT AAGAAACCAT TATTATCATG ACATTAACCT

18970 18980 18990 19000 19010 19020

ATAAAAATAG GCGTATCACG AGGCCCTTTC GTCTTCAAGA
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Ht D = Inactive Dihydrofolate reductase   SO = SV40 Origin of replication

X = CMV and SV40 enhancers

Ht B = Inactive Salmonella Histidinol Dehydrogenase

T = Herpes Simplex thymidine kinase promoter and polyoma enhancer

C = Cytomegalovirus promoter/enhancer   B = Bovine growth hormone polyadenylation
N1 = Neomycin phosphotransferase exon 1   N2 = Neomycin phosphotransferase exon 2
K = Human kappa constant   G1 = Human Gamma 1 constant

VL = Variable light chain anti-CD23 primate 5E8 and leader
VH = Variable heavy chain anti-CD23 primate 5E8N- and leader

Mandy cut Xba I Xho I and ligated to Xba I Xho I fragment from XKG1+CD23 5E8N-8HL

Map by Mitchell Reff   Constructed by Karen McLachlan 00/20/97   19,035 bp

Noncutters = AflII, AvrII, HindIII, I-PpoI, I-SceI, PmlI, RsrII, SgfI, SrfI
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DNASIS

Mandy + 5E8N-SHL 

10 20 30 40 50 60

TTAATTAAGG GGCGGAGAAT GGGCGGAACT GGGCGGAGTT AGGGGCGGGA TGGGCGCAGT 

70 80 90 100 110 120 

TAGGGGCGGG ACTATGGTTG CTGACTAATT GAGATGCATG CTTTGCATAC TTCTGCCTGC 

130 140 150 160 170 180 

TGGGGAGCCT GGGGACTTTC CACACCTGGT TGCTGACTAA TTGAGATGCA TGCTTTGCAT 

190 200 210 220 230 240 

ACTTCTGCCT GCTGGGGAGC CTGGGGACTT TCCACACCCT AACTGACACA CATTCCACAG 

250 260 270 280 290 300 

AATTAATTCC CCTAGTTATT AATAGTAATC AATTACGCGG TCATTAGTTC ATAGCCCATA 

310 320 330 340 350 360 

TATGGAGTTC CGCGTTACAT AACTTACGGT AAATGGCCCG CCTGGCTGAC CGCCCAACGA 

370 380 390 400 410 420 

CCCCCGCCCA TTGACGTCAA TAATGACGTA TGTTCCCATA GTAACGCCAA TAGGGACTTT

430 440 450 460 470 480 

CCATTGACGT CAATGGGTGG AGTATTTACG GTAAACTGCC CACTTGGCAG TACATCAAGT 

490 500 510 520 530 540 

GTATCATATG CCAAGTACGC CCCCTATTGA CGTCAATGAC GGTAAATGGC CCGCCTGGCA 

550 560 570 580 590 600 

TTATGCCCAG TACATGACCT TATGGGACTT TCCTACTTGG CAGTACATCT ACGTATTAGT 

610 620 630 640 650 660 

CATCGCTATT ACCATGGTGA TGCGGTTTTG GCAGTACATC AATGGGCGTG GATAGCGGTT

670 680 690 700 710 720 

TGACTCACGG GGATTTCCAA GTCTCCACCC CATTGACGTC AATGGGAGTT TGTTTTGAAG 

730 740 750 760 770 780 

/GTTTAAAC AGCTTGGCCG GCCAGCTTTA TTTAACGTGT TTACGTCGAG TCAATTGTAC 

790 800 810 820 830 840 

ACTAACGACA GTGATGAAAG AAATACAAAA GCGCATAATA TTTTGAACGA CGTCGAACCT 

850 860 870 880 890 900

TTATTACAAA ACAAAACACA AACGAATATC GACAAAGCTA GATTGCTGCT ACAAGATTTG 

910 920 930 940 950 960 

GCAAGTTTTG TGGCGTTGAG CGAAAATCCA TTAGATAGTC CAGCCATCGG TTCGGAAAAA

970 980 990 1000 1010 1020

CAACCCTTGT TTGAAACTAA TCGAAACCTA TTTTACAAAT CTATTGAGGA TTTAATATTT 

1030 1040 1050 1060 1070 1080 

AAATTCAGAT ATAAAGACGC TGAAAATCAT TTGATTTTCG CTCTAACATA CCACCCTAAA 

1090 1100 1110 1120 1130 1140 

GATTATAAAT TTAATGAATT ATTAAAATAC ATCAGCAACT ATATATTGAT AGACATTTCC 

1150 1160 1170 1180 1190 1200

AGTTTGTGAT ATTAGTTTGT GCGTCTCATT ACAATGGCTG TTATTTTTAA CAACAAACAA 

1210 1220 1230 1240 1250 1260

CTGCTCGCAG ACAATAGTAT AGAAAAGGGA GGTGAACTGT TTTTGTTTAA CGGTTCGTAC 

1270 1280 1290 1300 1310 1320

AACATTTTGG AAAGTTATGT TAATCCGGTG CTGCTAAAAA ATGGTGTAAT TGAACTAGAA 
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1330 1340 1350 1360 1370 1380 

GAAGCTGCGT ACTATGCCGG CAACATATTG TACAAAACCG ACGATCCCAA ATTCATTGAT 

1390 1400 1410 1420 1430 1440 

TATATAAATT TAATAATTAA AGCAACACAC TCCGAAGAAC TACCAGAAAA TAGCACTGTT 

1450 1460 1470 1480 1490 1500 

GTAAATTACA GAAAAACTAT GCGCAGCGGT ACTATACACC CCATTAAAAA AGACATATAT 

1510 1520 1530 1540 1550 1560

ATTTATGACA ACAAAAAATT TACTCTATAC GATAGATACA TATATGGATA CGATAATAAC 

1570 1580 1590 1600 1610 1620 

TATGTTAATT TTTATGAGGA GAAAAATGAA AAAGAGAAGG AATACGAAGA AGAAGACGAC

1630 1640 1650 1660 1670 1680 

AAGGCGTCTA GTTTATGTGA AAATAAAATT ATATTGTCGC AAATTAACTG TGAATCATTT 

1690 1700 1710 1720 1730 1740 

GAAAATGATT TTAAATATTA CCTCAGCGAT TATAACTACG CGTTTTCAAT TATAGATAAT

1750 1760 1770 1780 1790 1800 

ACTACAAATG TTCTTGTTGC GTTTGGTTTG TATCGTTAAT AAAAAACAAA TTTGACATTT 

1810 1820 1830 1840 1850 1860

ATAATTGTTT TATTATTCAA TAATTACAAA TAGGATTGAG ACCCTTGCAG TTGCCAGCAA 

1870 1880 1890 1900 1910 1920 

ACGGACAGAG CTTGTCGAGG AGAGTTGTTG ATTCATTGTT TGCCTCCCTG CTGCGGTTTT 

1930 1940 1950 1960 1970 1980 

TCACCGAAGT TCATGCCAGT CCAGCGTTTT TGCAGCAGAA AAGCCGCCGA CTTCGGTTTG 

1990 2000 2010 2020 2030 2040 

CGGTCGCGAG TGAAGATCCC TTTCTTGTTA CCGCCAACGC GCAATATGCC TTGCGAGGTC 

2050 2060 2070 2080 2090 2100 

GCAAAATCGG CCAAATTCCA TACCTGTTCA CCGACGACGG CGCTGACGCG ATCAAAGACG 

2110 2120 2130 2140 2150 2160 

CGGTGATACA TATCCAGCCA TGCACACTGA TACTCTTCAC TCCACATGTC GGTCTACATT

2170 2180 2190 2200 2210 2220 

GAGTGCAGCC CGGCTAACGT ATCCACGCCG TATTCGGTGA TGATAATCGG CTGATGCAGT 

2230 2240 2250 2260 2270 2280 

TTCTCCTCCC AGGCCAGAAG TTCTTTTTCC AGTACCTTCT CTGCCGTTTC CAAATCGCCG

2290 2300 2310 2320 2330 2340 

CTTTGGACAT ACCATCCGTA ATAACGGTTC AGGCACAGCA CATCAAAGAG ATCGCTGATG 

2350 2360 2370 2380 2390 2400 

GTATCGGTGT GAGCGTCGCA GAACATTACA TTGACGCAGG TGATCGGACG CGTCGGGTCG 

2410 2420 2430 2440 2450 2460 

AGTTTACGCG TTGCTTCCGC CAGTGGCGCG AAATATTCCC GTGCACCTTG CGGACGGGTA 

2470 2480 2490 2500 2510 2520 

TCCGGTTCGT TGGCAATACT CCACATCACC ACGCTTGGGT GGTTTTTGTC ACGCGCTATC 


2530 2540 2550 2560 2570 2580 

AGCTCTTTAA TCGCCTGTAA GTCCGCTTGC TGAGTTTCCC CGTTGACTGC CTCTTCGCTG



2590 2600 2610 2620 2630 2640 
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TACAGTTCTT TCGGCTTGTT GCCCGCTTCG AAACCAATGC CTAAAGAGAG GTTAAAGCCG 

2650 2660 2670 2680 2690 2700 

ACAGCAGCAG TTTCATCAAT CACCACGATG CCATGTTCAT CTGCCCACTC GAGCATCTCT 

2710 2720 2730 2740 2750 2760 

TCAGCGTAAG GGTAATGCGA GGTACGGTAG GAGTTGGCCC CAATCCAGTC CATTAATGCG 

2770 2780 2790 2800 2810 2820 

TGGTCGTGCA CCATCAGCAC GTTATCGAAT CCTTTGCCAC GCAAGTCCGC ATCTTCATGA

2830 2840 2850 2860 2870 2880 

CGACCAAAGC CAGTAAAGTA GAACGGTTTG TGGTTAATCA GGAACTGTTC GCCCTTCACT 

2890 2900 2910 2920 2930 2940 

GCCACTGACC GGATCCCGAC GCGAAGCGGG TAGATATCAC ACTCTGTCTG GCTTTTGGCT 

2950 2960 2970 2980 2990 3000 

GTGACGCACA GTTCATAGAG ATAACCTTCA CCCGGTTGCC AGAGGTGCGG ATTCACCACT

3010 3020 3030 3040 3050 3060 

TGCAAAGTCC CGCTAGTGCC TTGTCCAGTT GCAACCACCT GTTGATCCGC ATCACGCAGT 

3070 3080 3090 3100 3110 3120 

TCAACGCTGA CATCACCATT GGCCACCACC TGCCAGTCAA CAGACGCGTG GTTACAGTCT 

3130 3140 3150 3160 3170 3180 

TGCGCGACAT GCGTCACCAC GGTGATATCG TCCACCCAGG TGTTCGGCGT GGTGTAGAGC 

3190 3200 3210 3220 3230 3240 

ATTACGCTGC GATGGATTCC GGCATAGTTA AAGAAATCAT GGAAGTAAGA CTGCTTTTTC 

3250 3260 3270 3280 3290 3300 

TTGCCGTTTT CGTCGGTAAT CACCATTCCC GGCGGGATAG TCTGCCAGTT CAGTTCGTTG 

3310 3320 3330 3340 3350 3360 

TTCACACAAA CGGTGATACC CCTCGACGGA TTAAAGACTT CAAGCGGTCA ACTATGAAGA

3370 3380 3390 3400 3410 3420 

AGTGTTCGTC TTCGTCCCAG TAACCTATGT CTCCAGAATG TAGCCATCCA TCCTTGTCAA 

3430 3440 3450 3460 3470 3480 

TCAAGGCGTT GGTCGCTTCC GGATTGTTTA CATAACCGGA CATAATCATA GGTCCTCTGA 

3490 3500 3510 3520 3530 3540 

CACATAATTC GCCTCTCTGA TTAACGCCCA GCGTTTTCCC GGTATCCAGA TCCACAACCT 

3550 3560 3570 3580 3590 3600 

TCGCTTCAAA AAATGCAACA ACTTTACCGA CCGCGCCCGG TTTATCATCC CCCTCGGGTG

3610 3620 3630 3640 3650 3660 

TAATCAGAAT AGCTGATGTA GTCTCAGTGA GCCCATATCC TTGTCGTATC CCTGGAACAT 

3670 3680 3690 3700 3710 3720 

GGAAGCGTTT TGCAACCGCT TCCCCGACTT CTTTCGAAAG AGGTGCGCCC CCAGAAGCAA 

3730 3740 3750 3760 3770 3780 

TTTCGTGTAA ATTAGATAAA TCGTATTTGT CAATCAGAGT GCTTTTGGCG AAGAATGAAA 

3790 3800 3810 3820 3830 3840 

ATAGGGTTGG TACTACCAAC GCACTTTGAA TTTTGTAATC CTCAACMAT CGTAAAAACA 

3850 3860 3870 3880 3890 3900 

GCTCTTCTTC AAATCTATAC ATTAAGACGA CTCGAAATCC ACATATCAAA TATCCGAGTG 
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3910 3920 3930 3940 3950 3960

TAGTAAACAT TCCAAAACCG TGATGGAATG GAACAACACT TAAAATCGCA GTATCCGGAA

3970 3980 3990 4000 4010 4020

TGATTTGATT GCCAAAAATA GGATCTCTGG CATGCGAGAA TCTGACGCAG GCAGTTCTAT

4030 4040 4050 4060 4070 4080

GCGGAAGGGC CACACCCTTA GGTAACCCAG TAGATCCAGA GGAATTGTTT TGTCACGATC

4090 4100 4110 4120 4130 4140

AAAGGACTCT GGTACAAAAT CGTATTCATT AAAACCGGGA GGTAGATGAG ATGTGACGAA

4150 4160 4170 4180 4190 4200

CGTGTACATC GACTGAAATC CCTGGTAATC CGTTTTAGAA TCCATGATAA TAATTTTCTG

4210 4220 4230 4240 4250 4260

GATTATTGGT AATTTTTTTT GCACCTTCAA AATTTTTTGC AACCCCTTTT TGGAAACAAA

4270 4280 4290 4300 4310 4320

.CTACGGTA GGCTGCGAAA TGTTCATACT GTTGACCAAT TCACGTTCAT TATAAATGTC

4330 4340 4350 4360 4370 4380

GTTCGCGGCC GCAACTGCAA CTCCGATAAA TAACGCGCCC AACACCGGCA TAAAGAATTG

4390 4400 4410 4420 4430 4440

AAGAGAGTTT TCACTGCATA CGACGATTCT GTGATTTGTA TTCAGCCCAT ATCGTTTCAT

4450 4460 4470 4480 4490 4500

AGCTTCTGCC AACCGAACGG ACATTTCGAA GTATTCCGCG TACAGCCCGG CCGTTTAAAC

4510 4520 4530 4540 4550 4560

GGCCGGGCTT CAATACCCTG ATTGACTGGA ACAGCTGTAG CCCTGAACAG CAGCGTGCGC

4570 4580 4590 4600 4610 4620

TGCTGACGCG TCCGGCGATT TCCGCCTCTG ACAGTATTAC CCGGACGGTC AGCGATATTC

4630 4640 4650 4660 4670 4680

jGATAATGT AAAAACGCGC GGTGACGATG CCCTGCGTGA ATACAGCGCT AAATTTGATA

4690 4700 4710 4720 4730 4740

AAACAGAAGT GACAGCGCTA CGCGTCACCC CTGAAGAGAT CGCCGCCGCC GGCGCGCGTC

4750 4760 4770 4780 4790 4800

TGAGCGACGA ATTAAAACAG GCGATGACCG CTGCCGTCAA AAATATTGAA ACGTTCCATT

4810 4820 4830 4840 4850 4860

CCGCGCAGAC CCTACCGCCT GTAGATGTGG AAACCCAGCC AGGCGTGCGT TGCCAGCAGG

4870 4880 4890 4900 4910 4920

TTACGCGTCC CGTCTCGTCT GTCGGTCTGT ATATTCCCGG CGGCTCGGCT CCGCTCTTCT

4930 4940 4950 4960 4970 4980

[illegible] [illegible] [illegible] Gf ATTGCGGG ATGCCAGAAG GTGGTTCTGT

4990 5000 5010 5020 5030 5040

GCTCGCCGCC GCCCATCGCT GATGAAATCC TCTATGCGGC GCAACTGTGT GGCGTGCAGG

5050 5060 5070 5080 5090 5100

AAATCTTTAA CGTCGGCGGC GCGCAGGCGA TTGCCGCTCT GGCCTTCGGC AGCGAGTCCG

5110 5120 5130 5140 5150 5160

TACCCAAAGT GGATAAAATT TTTGGCCCCG GCAACGCCTT TGTAACCGAA GCCAAACGTC

5170 5180 5190 5200 5210 5220

AGGTCAGCCA GCGTCTCGAC GGCGCCGCTA TCGATATGCC AGCCGGGCCG TCTGAAGTAC
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5230 5240 5250 5260 5270 5280

TGGTGATCGC AGACAGCGGC GCAACACCGG ATTTCGTCGC TTCTGACCTG CTCTCCCAGG 

5290 5300 5310 5320 5330 5340 

CTGAGCACGG CCCGGATTCC CAGGTGATCC TGCTGACGCC TGATGCTGAC ATTGCCCGCA 

5350 5360 5370 5380 5390 5400 

AGGTGGCGGA GGCGGTAGAA CGTCAACTGG CGGAACTGCC GCGCGCGCAC ACCGCCCGGC 

5410 5420 5430 5440 5450 5460 

AGGCCCTGAG CGCCAGTCGT CTGATTGTGA CCAAAGATTT AGCGCAGTGC GTCGCCATCT

5470 5480 5490 5500 5510 5520 

CTAATCAGTA TGGGCCGGAA CACTTAATCA TCCAGACGCG CAATGCGCGC GATTTGGTGG

5530 5540 5550 5560 5570 5580 

ATGCGATTAC CAGCGCAGGC TCGGTATTTC TCGGCGACTG GTCGCCGGAA TCCGCCGGTG

5590 5600 5610 5620 5630 5640 

ATTACGCTTC CGGAACCAAC CATGTTTTAC CGACCTATGG CTATACTGCT ACCTGTTCCA 

5650 5660 5670 5680 5690 5700 

GCCTTGGGTT AGCGGATTTC CAGAAACGGA TGACCGTTCA GGAACTGTCG AAAGCGGGCT 

5710 5720 5730 5740 5750 5760 

TTTCCGCTCT GGCATCAACC ATTGAAACAT TGGCGGCGGC AGAACGTCTG ACCGCCCATA 

5770 5780 5790 5800 5810 5820 

AAAATGCCGT GACCCTGCCC GTAAACGCCC TCAAGGAGCA AGCATGAGCA CTGAAAACAC

5830 5840 5850 5860 5870 5880 

TCTCAGCGTC GCTGACTTAG CCCGTGAAAA TGTCCGCAAC CTGGAGATCC AGACATGGAT

5890 5900 5910 5920 5930 5940 

AAGATACATT GATGAGTTTG GACAAACCAC AACTAGAATG CAGTGAAAAA AATGCTTTAT

5950 5960 5970 5980 5990 6000 

TTGTGAAATT TGTGATGCTA TTGCTTTATT TGTAACCATT ATAAGCTGCA ATAAACAAGT

6010 6020 6030 6040 6050 6060

TAACAACAAC AATTGCATTC ATTTTATGTT TCAGGTTCAG GGGGAGGTGT GGGAGGTTTT 

6070 6080 6090 6100 6110 6120 

TTAAAGCAAG TAAAACCTCT ACAAATGTGG TATGGCTGAT TATGATCTCT AGGGCCGGCC

6130 6140 6150 6160 6170 6180 

CTCGACGGCG CGTCTAGAGC AGTGTGGTTT TCAAGAGGAA GCAAAAAGCC TCTCCACCCA 

6190 6200 6210 6220 6230 6240 

GGCCTGGAAT GTTTCCACCC AATGTCGAGC AGTGTGGTTT TGCAAGAGGA AGCAAAAAGC 

6250 6260 6270 6280 6290 6300 

CTCTCCACCC AGGCCTGGAA TGTTTCCACC CAATGTCGAG CAAACCCCGC CCAGCGTCTT

6310 6320 6330 6340 6350 6360 

GTCATTGGCG AATTGGAACA CGCATATGCA GTCGGGGCGG CGCGGTCCCA GGTCCACTTC 

6370 6380 6390 6400 6410 6420 

GCATATTAAG GTGGCGCGTG TGGCCTCGAA CACCGAGCGA CCCTGCAGCC AATATGGGAT

6430 6440 6450 6460 6470 6480 

CGCCCATTGA ACAAGATGGA TTGCACGCAG GTTCTCCGGC CGCTTGGGTG GAGAGGCTAT

6490 6500 6510 6520 6530 6540 
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TCGGCTATGA CTGGGCACAA CAGACAATCG GCTGCTCTGA TGCCGCCGTG TTCCGGCTGT

6550 6560 6570 6580 6590 6600 

CAGCGCAGGG GCGCCCGGTT CTTTTTGTCA AGACCGACCT GTCCGGTGCC CTGAATGAAC 

6610 6620 6630 6640 6650 6660 

TGCAGGTAAG TGCGGCCGTC GATGGCCGAG GCGGCCTCGG CCTCTGCATA AATAAAAAAA 

6670 6680 6690 6700 6710 6720 

ATTAGTCAGC CATGCATGGG GCGGAGAATG GGCGGAACTG GGCGGAGTTA GGGGCGGGAT 

6730 6740 6750 6760 6770 6780 

GGGCGGAGTT AGGGGCGGGA CTATGGTTGC TCACTAATTG AGATGCATGC TTTGCATACT 

6790 6800 6810 6820 6830 6840 

TCTGCCTGCT GGGGAGCCTG GGGACTTTCC ACACCTGGTT GCTGACTAAT TGAGATGCAT 

6850 6860 6870 6880 6890 6900 

GCTTTGCATA CTTCTGCCTG CTGGGGAGCC TGGGCACTTT CCACACCCTA ACTGACACAC

6910 6920 6930 6940 6950 6960 

ATTCCACAGA ATTAATTCCC CTAGTTATTA ATAGTAATCA ATTACGGGGT CATTAGTTCA

6970 6980 6990 7000 7010 7020 

TAGCCCATAT ATGGAGTTCC GCGTTACATA ACTTACGGTA AATGGCCCGC CTGGCTGACC

7030 7040 7050 7060 7070 7080 

GCCCAACGAC CCCCGCCCAT TGACGTCAAT AATGACGTAT GTTCCCATAG TAACGCCAAT 

7090 7100 7110 7120 7130 7140 

AGGGACTTTC CATTGACGTC AATGGGTGGA GTATTTACGG TAAACTGCCC ACTTCGCAGT 

7150 7160 7170 7180 7190 7200 

ACATCAAGTG TATCATATGC CAAGTACGCC CCCTATTGAC GTCAATGACG GTAAATGGCC 

7210 7220 7230 7240 7250 7260 

CGCCTGGCAT TATGCCCAGT ACATGACCTT ATGGGACTTT CCTACTTGGC AGTACATCTA

7270 7280 7290 7300 7310 7320 

CGTATTAGTC ATCGCTATTA CCATGGTGAT GCGGTTTTGG CAGTACATCA ATGGGCGTGG 

7330 7340 7350 7360 7370 7380 

ATAGCGGTTT GACTCACGGG GATTTCCAAG TCTCCACCCC ATTGACGTCA ATGGGAGTTT 

7390 7400 7410 7420 7430 7440 

GTTTTGGCAC CAAAATCAAC GGGACTTTCC AAAATGTCGT AACAACTCCG CCCCATTGAC 

7450 7460 7470 7480 7490 7500 

GCAAATGGGC GGTAGGCGTG TACGGTGGGA GGTCTATATA AGCAGAGCTG GGTACGTGAA

7510 7520 7530 7540 7550 7560 

CCGTCAGATC GCCTGGAGAC GCCATCACAG ATCTCTCACC ATGGACATGA GGGTCCCCGC 

7570 7580 7590 7600 7610 7620 

TCAGCTCCTG GGGCTCCTTC TGCTCTGGCT CCCAGCTGCC AGATGTGACA TCCAGATGAC 

7630 7640 7650 7660 7670 7680 

CCAGTCTCCA TCTTCCCTGT CTGCATCTGT AGGGGACAGA GTCACCATCA CTTGCAGGGC

7690 7700 7710 7720 7730 7740 

AAGTCAGGAC ATTAGGTATT ATTTAAATTG GTATCAGCAG AAACCAGGAA AAGCTCCTAA 



7750 7760 7770 7780 7790 7800 

GCTCCTGATC TATGTTGCAT CCAGTTTGCA AAGTGGGGTC CCATCAAGGT TCAGCGGCAG 
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7810 7820 7830 7840 7850 7860

TCGATCTGCG ACAGACTTCA CTCTCACCGT CAGCAGCCTG CAGCCTGAAG ATTTTGCCAC

7870 7880 7890 7900 7910 7920

TTATTACTGT CTACAGCTTT ATAGTACCCC TCGGACGTTC GGCCAAGGGA CCAAGGTGGA

7930 7940 7950 7960 7970 7980

AATCAAACCT ACGGTGGCTG CACCATCTGT CTTCATCTTC CCGCCATCTG ATGAGCAGTT

7990 8000 8010 8020 8030 8040

GAAATCTGGA ACTGCCTCTG TTGTGTGCCT GCTGAATAAC TTCTATCCCA GAGAGGCCAA

8050 8060 8070 8080 8090 8100

AGTACAGTGG AAGGTGGATA ACGCCCTCCA ATCGGGTAAC TCCCAGCAGA GTGTCACAGA

8110 8120 8130 8140 8150 8160

GCAGGACAGC AAGGACAGCA CCTACAGCCT CAGCAGCACC CTGACGCTGA GCAAAGCAGA

8170 8180 8190 8200 8210 8220

CTACGAGAAA CACAAAGTCT ACGCCTGCGA AGTCACCCAT CAGGGCCTGA GCTCGCCCGT

8230 8240 8250 8260 8270 8280

CACAAAGAGC TTCAACAGGG GAGAGTGTTG AATTCAGATC CGTTAACGGT TACCAACTAC

8290 8300 8310 8320 8330 8340

CTAGACTGGA TTCGTGACAA CATGCGGCCG TGATATCTAC GTATGATCAG CCTCGACTGT

8350 8360 8370 8380 8390 8400

GCCTTCTAGT TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT TGACCCTGGA

8410 8420 8430 8440 8450 8460

AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA ATTGCATCGC ATTGTCTGAG

8470 8480 8490 8500 8510 8520

TAGGTGTCAT TCTATTCTGG GGGGTGGGGT GGGGCAGGAC AGCAAGGGGG AGGATTGGGA

8530 8540 8550 8560 8570 8580

AGACAATACC AGGCATGCTG GGGATGCGGT GGGCTCTATG GCTTCTGAGG CGGAAAGAAC

8590 8600 8610 8620 8630 8640

CAGCTGGGAC TAGTCGCAAT TGGGCGGAGT TAGGGGCGGG ATGGGCGGAG TTAGGGGCGG

8650 8660 8670 8680 8690 8700

GACTATGGTT GCTGACTAAT TGACATGCAT GCTTTGCATA CTTCTGCCTG CTGGGGAGCC

8710 8720 8730 8740 8750 8760

TGGGGACTTT CCACACCTGG TTGCTCACTA ATTGAGATGC ATGCTTTGCA TACTTCTGCC

8770 8780 8790 8800 8810 8820

TCCTGGGCAG CCTGGGGACT TTCCACACCC TAACTGACAC ACATTCCACA GAATTAATTC

8830 8840 8850 8860 8870 8880

CCCTAGTTAT TAATAGTAAT CAATTACGGG GTCATTAGTT CATAGCCCAT ATATGGAGTT

8890 8900 8910 8920 8930 8940

CCGCGTTACA TAACTTACGG TAAATGGCCC GCCTGGCTGA CCGCCCAACG ACCCCCGCCC

8950 8960 8970 8980 8990 9000

ATTGACCTCA ATAATGACGT ATGTTCCCAT AGTAACGCCA ATAGGGACTT TCCATTGACG

9010 9020 9030 9040 9050 9060

TCAATGGGTG CAGTATTTAC GGTAAACTGC CCACTTGGCA GTACATCAAG TGTATCATAT



9070 9080 9090 9100 9110 9120 

GCCAAGTACG CCCCCTATTG ACGTCAATGA CGGTAAATGG CCCGCCTGGC ATTATGCCCA 
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9130 9140 9150 9160 9170 9180

GTACATGACC TTATGGGACT TTCCTACTTG GCAGTACATC TACGTATTAC TCATCGCTGT

9190 9200 9210 9220 9230 9240

TACCATGGTG ATGCGCTTTT GGCAGTACAT CAATGGGCGT GGATAGCGGT TTGACTCACG

9250 9260 9270 9280 9290 9300

GGGATTTCCA AGTCTCCACC CCATTGACGT CAATGGGACT TTGTTTTGGC ACCAAAATCA 

9310 9320 9330 9340 9350 9360 

ACGGGACTTT CCAAAATGTC GTAACAACTC CGCCCCATTG ACGCAAATGG GCGGTAGGCG 

9370 9380 9390 9400 9410 9420 

TGTACGGTGG GAGGTCTATA TAAGCAGAGC TGGGTACGTG AACCGTCAGA TCGCCTGGAG 

9430 9440 9450 9460 9470 9480 

ACGCCGTCGA CATGGGTTGG AGCCTCATCT TGCTCTTCCT TGTCGCTGTT GCTACGCGTG

9490 9500 9510 9520 9530 9540 

. CCTGTCCGA GGTGCAGCTG GTGGAGTCTG GGGGCGGCTT GGCAAAGCCT GGGGGGTCCC 

9550 9560 9570 9580 9590 9600 

TGAGACTCTC CTGCGCAGCC TCCGGGTTCA GGTTCACCTT CAATAACTAC TACATGGACT

9610 9620 9630 9640 9650 9660 

GGGTCCGCCA GGCTCCAGGG CAGGGGCTGG AGTGGGTCTC ACGTATTAGT AGTAGTGGTG 

9670 9680 9690 9700 9710 9720 

ATCCCACATG GTACGCAGAC TCCGTGAAGG GCAGATTCAC CATCTCCAGA GAGAACGCCA 

9730 9740 9750 9760 9770 9780 

AGAACACACT GTTTCTTCAA ATGAACAGCC TGAGAGCTGA GGACACGGCT GTCTATTACT

9790 9800 9810 9820 9830 9840 

GTGCGAGCTT GACTACAGGG TCTGACTCCT GGGGCCAGGG AGTCCTGGTC ACCGTCTCCT 

9850 9860 9870 9880 9890 9900 

CAGCTAGCAC CAAGGGCCCA TCGGTCTTCC CCCTGGCACC CTCCTCCAAG AGCACCTCTG

9910 9920 9930 9940 9950 9960 

GGGGCACAGC GGCCCTGGGC TGCCTGGTCA AGGACTACTT CCCCGAACCG GTGACGGTGT 

9970 9980 9990 10000 10010 10020

CGTGGAACTC AGGCGCCCTG ACCAGCGGCG TGCACACCTT CCCGGCTGTC CTACAGTCCT 

10030 10040 10050 10060 10070 10080

CAGGACTCTA CTCCCTCAGC AGCGTGGTGA CCGTCCCCTC CAGCAGCTTG GGCACCCAGA 

10090 10100 10110 10120 10130 10140

CCTACATCTG CAACGTGAAT CACAAGCCCA GCAACACCAA GGTGGACAAG AAAGTTGAGC 

10150 10160 10170 10180 10190 10200

CCAAATCTTG TGACAAAACT CACACATGCC CACCGTGCCC AGCACCTGAA CTCCTGGCGG 

10210 10220 10230 10240 10250 10260 

GACCGTCAGT CTTCCTCTTC CCCCCAAAAC CCAAGGACAC CCTCATGATC TCCCGGACCC

10270 10280 10290 10300 10310 10320 

CTGAGGTCAC ATGCGTGGTG GTGGACGTGA GCCACGAAGA CCCTGAGGTC AAGTTCAACT

10330 10340 10350 10360 10370 10380 

GGTACGTGGA CGGCGTGGAG GTGCATAATG CCAAGACAAA GCCGCGGGAG GAGCAGTACA 

10390 10400 10410 10420 10430 10440
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ACAGCACGTA CCGTGTCGTC ACCGTCCTCA CCGTCCTGCA CCAGGACTGG CTGAATGGCA

10450 10460 10470 10480 10490 10500

AGGAGTACAA GTGCAAGGTC TCCAACAAAG CCCTCCCAGC CCCCATCGAG AAAACCATCT 

10510 10520 10530 10540 10550 10560 

CCAAAGCCAA AGGGCAGCCC CGACAACCAC AGCTCT1CAC CCTCCCCCCA TCCCCCGATC 

10570 10580 10590 10600 10610 10620 

AGCTGACCAA GAACCACGTC AGCCTGACCT GCCTGGTCAA AGGCTTCTAT CCCAGCGACA 

10630 10640 10650 10660 10670 10680

TCGCCGTGGA GTGGGACAGC AATGGGCAGC CGGAGAACAA CTACAAGACC ACGCCTCCCG 

10690 10700 10710 10720 10730 10740

TGCTGGACTC CGACGGCTCC TTCTTCCTCT ACAGCAAGCT CACCGTGCAC AAGAGCAGGT

10750 10760 10770 10780 10790 10800

GGCAGCAGGG GAACGTCTTC TCATGCTCCG TGATGCATGA GGCTCTGCAC AACCACTACA

10810 10820 10830 10840 10850 10860 

CGCAGAAGAG CCTCTCCCTG TCTCCGGGTA AATGAGGATC CGTTAACGCT TACCAACTAC

10870 10880 10890 10900 10910 10920 

CTAGACTGGA TTCGTGACAA CATGCGGCCG TGATATCTAC GTATGATCAG CCTCGACTGT

10930 10940 10950 10960 10970 10980 

GCCTTCTAGT TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT TGACCCTGGA 

10990 11000 11010 11020 11030 11040 

AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA ATTGCATCGC ATTGTCTGAG 

11050 11060 11070 11080 11090 11100

TAGGTGTCAT TCTATTCTGG GGGGTGGGGT GGGGCAGGAC AGCAAGGGGG AGGATTGGGA

11110 11120 11130 11140 11150 11160

AGACAATACC AGGCATGCTG GGGATGCGGT GGGCTCTATG GCTTCTGACG CGGAAAGAAC

11170 11180 11190 11200 11210 11220

CAGCTGGGGC TCGACAGCAA CGCTAGGTCG AGGCCGCTAC TAACTCTCTC CTCCCTCCTT

11230 11240 11250 11260 11270 11280

TTTCCTGCAG GACCAGGCAG CGCGGCTATC GTGGCTGGCC ACGACGGGCG TTCCTTGCCC 

11290 11300 11310 11320 11330 11340

AGCTGTGCTC GACGTTGTCA CTGAAGCGGG AAGGGACTGG CTGCTATTGG GCGAAGTGCC 

11350 11360 11370 11380 11390 11400

GGCGCAGGAT CTCCTGTCAT CTCACCTTGC TCCTGCCGAG AAAGTATCCA TCATGGCTGA 

11410 11420 11430 11440 11450 11460

TGCAATCCCC CGGCTGCATA CGCTTGATCC GGCTACCTGC CCATTCCACC ACCAAGCGAA 

11470 11480 11490 11500 11510 11520

ACATCGCATC GAGCGAGCAC GTACTCGCAT GGAAGCCGGT CTTGTCGATC AGGATGATCT

11530 11540 11550 11560 11570 11580

GGACGAAGAG CATCAGGGGC TCGCGCCAGC CGAACTGTTC GCCAGGTAAG TGAGCTCCAA 

11590 11600 11610 11620 11630 11640

TTCAAGCTCT CCAGCTACGG CGCCCAGCTA CTAGCTTTGC TTCTCAATTT CTTATTTGCA 

11650 11660 11670 11680 11690 11700

TAATGAGAAA AAAAGGAAAA TTAATTTTAA CACCAATTCA CTACTTGATT GACCAAATGC 
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11710 11720 11730 11740 11750 11760 

GTTGCCAAAA ACGATGCTTT AGAGACAGTG TTCTCTCCAC AGATAACCAC AAACATTATT 

11770 11780 11790 11800 11810 11820 

CAGAGGGAGT ACCCAGAGCT GAGACTCCTA AGCCAGTGAG TGGCACAGCA TCCAGCGAGA

11830 11840 11850 11860 11870 11880

AATATGCTTC TCATCACCGA AGCCTGATTC CGTAGAGCCA CACCCTGGTA AGGGCCAATC

11890 11900 11910 11920 11930 11940 

TGCTCACACA GGATAGAGAG GGCAGGAGCC AGGGCAGAGC ATATAAGGTG AGGTAGGATC 

11950 11960 11970 11980 11990 12000

ACTTGCTCCT CACATTTCCT TCTGACATAG TTGTGTTGGG AGCTTGGATA GCTTGGGCGG

12010 12020 12030 12040 12050 12060 

GGGACAGCTC AGGCCTGCGA TTTCGCGCCA AACTTGACGG CAATCCTAGC GTGAAGGCTG 

12070 12080 12090 12100 12110 12120

GTAGGATTTT ATCCCCGCTG CCATCATGGT TCGACCATTG AACTGCATCG TCGCCGTGTC

12130 12140 12150 12160 12170 12180 

CCAAAATATG GGGATTGGCA AGAACGGAGA CCTACCCTGG CCTCCGCTCA GGAACGAGTT

12190 12200 12210 12220 12230 12240 

CAAGTACTTC CAAAGAATGA CCACAACCTC TTCAGTGGAA GGTAAACAGA ATCTGGTGAT 

12250 12260 12270 12280 12290 12300 

TATGGGTAGG AAAACCTGGT TCTCCATTCC TGAGAAGAAT CGACCTTTAA AGGACAGAAT 

12310 12320 12330 12340 12350 12360 

TAATATAGTT CTCAGTAGAG AACTCAAAGA ACCACCACGA GGAGCTCATT TTCTTGCCAA 

12370 12380 12390 12400 12410 12420 

AAGTTTGGAT GATGCCTTAA CGTACGCGCG CCATTAAGAC TTATTGAACA ACCGGAATTG

12430 12440 12450 12460 12470 12480 

GCAAGTAAAG TAGACATGGT TTGGATAGTC GGAGGCAGTT CTGTTTACCA CGAAGCCATG

12490 12500 12510 12520 12530 12540 

AATCAACCAG GCCACCTCAG ACTCTTTGTG ACAAGGATCA TGCAGGAATT TGAAAGTGAC 

12550 12560 12570 12580 12590 12600

ACGTTTTTCC CAGAAATTCA TTTGGGGAAA TATAAACTTC TCCCAGAATA CCCAGGCGTC 

12610 12620 12630 12640 12650 12660 

CTCTCTGAGC TCCAGGAGGA AAAAGGCATC AAGTATAAGT TTCAAGTCTA CGAGAAGAAA

12670 12680 12690 12700 12710 12720

GACTAACAGG AAGATCCTTT CAAGTTCTCT GCTCCCCTCC TAAAGCTATG CATTTTTATA

12730 12740 12750 12760 12770 12780

ACACCATGGG ACTTTTGCTG GCTTTAGATC AGCCTCGACT CTGCCTTCTA GTTGCCAGCC 

12790 12800 12810 12820 12830 12840 

ATCTCTTGTT TGCCCCTCCC CCGTGCCTTC CTTGACCCTG GAAGGTGCCA CTCCCACTGT 

12850 12860 12870 12880 12890 12900 

CCTTTCCTAA TAAAATGAGG AAATTGCATC GCATTGTCTG AGTAGGTGTC ATTCTATTCT 

12910 12920 12930 12940 12950 12960 

GGGGGGTGGG GTGGGGCAGG ACAGCAAGGG GGAGGATTGG GAAGACAATA GCAGGCATGC 

12970 12980 12990 13000 13010 13020 

TGGGGATGCG GTGGGCTCTA TCGCTTCTGA GGCGGAAAGA ACCAGCTGGG GCTCGAAGCG

13030 13040 13050 13060 13070 13080

GCCGCCCATT TCGCTGGTCG TCAGATGCGG GATGCCCTGG GACGCGGCGG GGAGCGTCAC

13090 13100 13110 13120 13130 13140 

ACTGAGGTTT TCCGCCAGAC GCCACTGCTG CCAGGCGCTG ATGTGCCCGG CTTCTCACCA

13150 13160 13170 13180 13190 13200 

TGCGGTCGCG TTCGGTTGCA CTACGCGTAC TGTGAGCCAG AGTTGCCCGG CGCTCTCCGG

13210 13220 13230 13240 13250 13260 

CTGCGGTAGT TCAGGCAGTT CAATCAACTG TTTACCTTGT GGAGCGACAT CCAGAGGCAC

13270 13280 13290 13300 13310 13320 

TTCACCGCTT GCCAGCGGCT TACCATCCAG CGCCACCATC CAGTGCAGGA GCTCGTTATC

13330 13340 13350 13360 13370 13380 

GCTATGACGG AACAGGTATT CGCTGGTCAC TTCGATGGTT TGCCCGGATA AACCGAACTG

13390 13400 13410 13420 13430 13440 

wJUAACTGC TGCTGGTGTT TTGCTTCCGT CAGCGCTGGA TGCGGCGTGC GGTCGGCAAA

13450 13460 13470 13480 13490 13500 

GACCAGACCG TTCATACAGA ACTGGCGATC GTTCGGCGTA TCGCCAAAAT CACCGCCGTA 

13510 13520 13530 13540 13550 13560 

AGCCGACCAC GGGTTGCCGT TTTCATCATA TTTAATCAGC GACTGATCCA CCCAGTCCa 

13570 13580 13590 13600 13610 13620

GACGAAGCCG CCCTGTAAAC GGGGATACTG ACGAAACGCC TGCCAGTATT TAGCGAAACC

13630 13640 13650 13660 13670 13680 

GCCAAGACTG TTACCCATCG CGTGGGCGTA TTCGCAAAGG ATCAGCGCGC GCGTCTCTCC 

13690 13700 13710 13720 13730 13740

AGGTAGCGAA ACCCATTTTT TGATGGACCA TTTCGGCACA GCCCGGAAGG GCTGGTCTTC

13750 13760 13770 13780 13790 13800

*rCCACCCGC GCGTACATCG GGCAAATAAT ATCGGTGGCC GTGCTCTCGG CTCCGCCGCC

13810 13820 13830 13840 13850 13860 

TTCATACTGC ACCGGGCGGG AAGGATCGAC AGATTTGATC CAGCGATACA GCGCGTCGTG 

13870 13880 13890 13900 13910 13920

ATTAGCGCCG TGGCCTGATT CATTCCCCAG CGACCAGATG ATCACACTCG GGTGATTACG

13930 13940 13950 13960 13970 13980

ATCGCGCTGC ACCATTCGCG TTACGCGTTC GCTCATCGCC GGTAGCCAGC GCGGATCATC

13990 14000 14010 14020 14030 14040

GGTCAGACGA TTCATTGGCA CCATGCCGTG GGTTTCAATA TTGGCTTCAT CCACCACATA 

14050 14060 14070 14080 14090 14100

CAGGCCGTAG CGGTCGCACA GCGTGTACCA CAGCGGATGG TTCGGATAAT GCGAACAGCG

14110 14120 14130 14140 14150 14160

CACGGCGTTA AAGTTGTTCT CCTTCATCAG CAGGATATCC TGCACCATCG TCTGCTCATC

14170 14180 14190 14200 14210 14220

CATGACCTGA CCATGCAGAG GATGATGCTC GTGACGGTTA ACGCCTCGAA TCAGCAACGG

14230 14240 14250 14260 14270 14280 

CTTGCCGTTC AGCAGCAGCA GACCATTTTC AATCCGCACC TCGCGGAAAC CGACATCGCA 



14290 14300 14310 14320 14330 14340 

GGCTTCTGCT TCAATCAGCG TCCCCTCCGC GGTGTGCAGT TCAACCACCG CACGATAGAG

14350 14360 14370 14380 14390 14400

ATTCGGGATT TCGGCGCTCC ACAGTTTCGG CTTTTCGACG TTCAGACGTA GTGTGACGCG

14410 14420 14430 14440 14450 14460

ATCGGCATAA CCACCACGCT CATCGATAAT TTCACCGCCG AAAGGCGCGG TGCCGCTGGC

14470 14480 14490 14500 14510 14520

GACCTGCGTT TCACCCTGCC ATAAAGAAAC TGTTICCCGT AGGTAGTCAC GCAACTCGCC

14530 14540 14550 14560 14570 14580 

GCACATCTGA ACTTCAGCCT CCAGTACAGC GCGGCTGAAA TCATCATTAA AGCGAGTGGC 

14590 14600 14610 14620 14630 14640

AACATCGAAA TCGCTGATTT GTGTAGTCGG TTTATGCAGC AACGAGACGT CACGGAAAAT

14650 14660 14670 14680 14690 14700

rr CGCTCATC CGCCACATAT CCTGATCTTC CAGATAACTG CCGTaCTCC AGCGCAGCAC 

14710 14720 14730 14740 14750 14760

CATCACCGCG AGGCGGTTTT CTCCGGCGCG TAAAAATGCG CTCAGGTCAA ATTCAGACGG 

14770 14780 14790 14800 14810 14820

CAAACGACTG TCCTGGCCGT AACCGACCCA GCGCCCGTTG CACCACAGAT GAAACGCCGA 

14830 14840 14850 14860 14870 14880

GTTAACGCCA TCAAAAATAA TTCGCGTCTG GCCTTCCTGT AGCCAGCTTT CATCAACATT 

14890 14900 14910 14920 14930 14940

AAATGTGAGC GAGTAACAAC CCGTCGGATT CTCCGTGGGA ACAAACGGCG GATTGACCGT 

14950 14960 14970 14980 14990 15000

AATGGCATAG GTCACGTTGG TGTAGATGGG CGCATCGTAA CCGTGCATCT GCCAGTTTGA

15010 15020 15030 15040 15050 15060

"GGACGACG ACAGTATCGG CCTCAGGAAG ATCGCACTCC AGCCAGCTTT CCGGCACCGC 

15070 15080 15090 15100 15110 15120

TTCTGGTGCC GGAAACCAGG CAAAGCGCCA TTCGCCATTC AGGCTGCGCA ACTGTTGGGA

15130 15140 15150 15160 15170 15180 

AGGGCGATCG GTGCGGGCCT CTTCGCTATT ACGCCAGCTG GCGAAAGGGG GATGTGCTGC 

15190 15200 15210 15220 15230 15240

AACCCCATTA AGTTGGGTAA CGCCAGGCTT TTCCCAGTCA CGACGTTGTA AAACGACTTA

15250 15260 15270 15280 15290 15300

ATCCGTCGAG GGGCTGCCTC GAAGCAGACG ACCTTCCGTT GTGCAGCCAG CGGCGCCTGC 

15310 15320 15330 15340 15350 15360

GCCGGTGCCC ACAATCGTGC GCGAACAAAC TAAACCAGAA CAAATTATAC CGGCGGCACC 

15370 15380 15390 15400 15410 15420

GCCCCCACCA CCTTaCCCG TGCCTAACAT TCCAGCGCCT CCACCACCAC CACCACCATC 

15430 15440 15450 15460 15470 15480

GATCTCTGAA TTGCCGCCCG CTCCACCAAT GCCGACGGAA CCTCAACCCG CTGCACCTTT 

15490 15500 15510 15520 15530 15540

AGACGACAGA CAACAATTGT TGGAAGCTAT TAGAAACGAA AAAAATCGCA CTCGTCTCAG 

15550 15560 15570 15580 15590 15600

ACCGGTCAAA CCAAAAACGG CGCCCGAAAC CAGTACAATA GTTGAGGTGC CGACTGTGTT

15610 15620 15630 15640 15650 15660

GCCTAAAGAG ACATTTGAGC CTAAACCGCC GTCTGCATCA CCGCUCaC CTCCGCCTCC

15670 15680 15690 15700 15710 15720

GCCTCCGCCG CCAGCCCCGC CTGCGCCTCC ACCGATGGTA GATTTATCAT CACCTCCACC

15730 15740 15750 15760 15770 15780

ACCGCCCCCA TTACTAGATT TGCCGTCTGA AATGTTACCA CCGCCTGCAC CATCGCTTTC

15790 15800 15810 15820 15830 15840 

TAACGTGTTG TCTGAATTAA AATCGGGCAC AGTTAGATTG AAACCCGCCC AAAAACGCCC

15850 15860 15870 15880 15890 15900 

GCAATCAGAA ATAATTCCAA AAAGCTCAAC TACAAATTTG ATCGCGGACG TGTTAGCCGA

15910 15920 15930 15940 15950 15960

CACAATTAAT AGGCGTCGTG TGGCTATGGC AAAATCGTCT TCGGAAGCAA CTTCTAACCA

15970 15980 15990 16000 16010 16020

AGGGTTGG GACGACGACG ATAATCGGCC TAATAAAGCT AACACGCCCG ATCTTAAATA

16030 16040 16050 16060 16070 16080 

JGTCCAAGCT ACTAGTGGTA CCGCTTGGCA GAACATATCC ATCGCGTCCG CCATCTCCAG 

16090 16100 16110 16120 16130 16140 

CAGCCGCACG CGGCGCATCT CGGGCAGCGT TGGGTCCTGG CCACGGGTGC CCATGATCGT

16150 16160 16170 16180 16190 16200

GCTCCTGTCG TTGAGGACCC GGCTAGGCTG GCGGGGTTGC CTTACTGGTT AGCAGAATGA

16210 16220 16230 16240 16250 16260

ATCACCGATA CGCGAGCGAA CGTGAAGCGA CTGCTGCTGC AAAACGTCTG CGACCTGAGC

16270 16280 16290 16300 16310 16320

AACAACATGA ATGGTCTTCG GTTTCCGTGT TTCGTAAAGT CTGGAAACGC GGAAGTCAGC

16330 16340 16350 16360 16370 16380

:cctgcacc ATTATGTTCC GGATCTGCAT CGCAGGATGC TGCTGGCTAC CCTGTGGAAC 

16390 16400 16410 16420 16430 16440

ACCTACATCT GTATTAACGA AGCGCTGGCA TTGACCCTGA GTGATmTC TCTGGTCCCG

16450 16460 16470 16480 16490 16500

CCGCATCCAT ACCGCCAGTT GTTTACCCTC ACAACGTTCC AGTAACCGGG CATGTTCATC

16510 16520 16530 16540 16550 16560

ATCAGTAACC CGTATCGTCA CCATCCTCTC TCGTTTCATC GGTATCATTA CCCCCATGAA

16570 16580 16590 16600 16610 16620

CAGAAATCCC CCTTACACGG AGGCATCAGT GACCAAACAG GAAAAAACCG CCCTTAACAT

16630 16640 16650 16660 16670 16680 

GGCCCGCTTT ATCAGAAGCC AGACATTAAC GCTTCTGGAG AAACTCAACG AGCTGGACGC

16690 16700 16710 16720 16730 16740

GGATGAACAG GCAGACATCT GTGAATCGCT TCACGACCAC GCTGATGAGC TTTACCGCAG

16750 16760 16770 16780 16790 16800

CTGCCTCGCG CGTTTCGGTG ATGACGGTGA AAACCTCTGA CACATGCAGC TCCCGGAGAC

16810 16820 16830 16840 16850 16860 

GCTCACAGCT TGTCTCTAAG CGGATGCCGG GAGCAGACAA GCCCGTCAGG GCGCGTCAGC 

16870 16880 16890 16900 16910 16920 

GGGTGTTGGC GGGTGTCGGG GCGCAGCCAT GACCCAGTCA CGTAGCGATA GCGGAGTGTA

16930 16940 16950 16960 16970 16980

TACTCGCTTA ACTATGCGGC ATCAGAGCAG ATTGTACTGA GAGTGCACCA TATGCGGTGT

16990 17000 17010 17020 17030 17040

GAAATACCGC ACAGATGCGT AAGGAGAAAA TACCGCATCA GGCGCTCTTC CGCTTCCTCG

17050 17060 17070 17080 17090 17100

CTCACTGACT CGCTGCGCTC GGTCGTTCGG CTGCGGCGAG CGGTATCAGC TCACTCAAAG 

17110 17120 17130 17140 17150 17160

GCGGTAATAC GGTTATCCAC AGAATCAGGG GATAACGCAG GAAAGAACAT GTGAGCAAAA

17170 17180 17190 17200 17210 17220

GCCCAGCAAA AGGCCAGGAA CCGTAAAAAG GCCGCGTTGC TGGCGTTTTT CCATAGGCTC

17230 17240 17250 17260 17270 17280

CGCCCCCCTG ACGAGCATCA CAAAAATCGA CGCTCAAGTC AGAGGTGGCG AAACCCGACA 

GGCTCTGACG CTCAGTGGAA CGAAAACTCA CGTTAAGGGA TTTTGGTCAT GAGATTATCA

17890 17900 17910 17920 17930 17940

AAAAGGATCT TCACCTACAT CCTTTTAAAT TAAAAATGAA GTTTTAAATC AATCTAAAGT

ATATATGAGT AAACTTGGTC TGACAGTTAC CAATGCTTAA TCAGTGAGGC ACCTATCTCA

18070 18080 18090 18100 18110 18120

CCtCCAACTT TATCCGCCTC CATCCAGTCT ATTAATTGTT GCCGGCAACC TAGACTAAGT

ACTTCGCCAG TTAATAGTTT GCGCAACGTT GTTGCCATTG CTGCAGGCAT CGTGGTCTCA

18310 18320 18330 18340 18350 18360 

CGCTCGTCGT TTGGTATGGC TTCATTCAGC TCCGGTTCCC AACGATCAAG GCGAGTTACA

18370 18380 18390 18400 18410 18420

TGATCCCCCA TGTTGTGCAA AAAACCGGTT AGCTCCTTCG GTCCTCCGAT CGTTCTCAGA

18430 18440 18450 18460 18470 18480

AGTAAGTTGG CCGCAGTGTT ATCACTCATG GTTATCGCAG CACTGCATAA ( Kit I I ACT 

GTCATGCCAT CCGTAAGATG CTTTTCTGTG ACTGGTGAGT ACTCAACCAA GTCATTCTGA

GAATAGTGTA TGCGGCGACC GAGTTGCTCT TGCCCGGCGT CAACACGGGA TAATACCGCG

18610 18620 18630 18640 18650 18660 

(.CACATAGCA GAACTTTAAA AGTGCTCATC ATTGGAAAAC GTTCTTCGGG GCGAAAACTC

TCTTCAGCAT CTTTTACTTT CACCAGCGTT TCTGGGTGAG CAAAAACAGG AAGGCAAAAT

18790 18800 18810 18820 18830 18840

GCCGCAAAAA AGGGAATAAG GGCGACACGG AAATGTTGAA TACTCATACT CTTCCTTTTT 

18850 18860 18870 18880 18890 18900 

CAATATTATT GAAGCATTTA TCAGGGTTAT TGTCTCATGA GCGGATACAT ATTTGAATGT 

18910 18920 18930 18940 18950 18960

ATTTAGAAAA ATAAACAAAT AGGGGTTCCG CGCACATTTC CCCGAAAAGT GCCACCTGAC

18970 18980 18990 19000 19010 19020 

GTCTAAGAAA CCATTATTAT CATGACATTA ACCTATAAAA ATAGGCGTAT CACGAGGCCC 

19030 19040 19050 19060 19070 19080
TTTCGTCTTC AAGAA 